NOVEL NUCLEIC ACIDS AND POLYPEPTIDES

1. TECHNICAL FIELD 

The present invention provides novel polynucleotides and proteins encoded by such 
polynucleotides, along with uses for these polynucleotides and proteins, for example in 
therapeutic, diagnostic and research methods. 



2. BACKGROUND 

Technology aimed at the discovery of protein factors (including, e.g., cytokines, such as
lymphokines, interferons, CSFs, chemokines, and interleukins) has matured rapidly over the past
decade. The now routine hybridization cloning and expression cloning techniques clone novel
polynucleotides "directly" in the sense that they rely on information directly related to the
discovered protein (i.e., partial DNA/amino acid sequence of the protein in the case of
hybridization cloning; activity of the protein in the case of expression cloning). More recent
"indirect" cloning techniques such as signal sequence cloning, which isolates DNA sequences
based on the presence of a now well-recognized secretory leader sequence motif, as well as
various PCR-based or low stringency hybridization-based cloning techniques, have advanced the
state of the art by making available large numbers of DNA/amino acid sequences for proteins
that are known to have biological activity, for example, by virtue of their secreted nature in the
case of leader sequence cloning, by virtue of their cell or tissue source in the case of PCR-based
techniques, or by virtue of structural similarity to other genes of known biological activity.

Identified polynucleotide and polypeptide sequences have numerous applications in, for
example, diagnostics, forensics, gene mapping, identification of mutations responsible for
genetic disorders or other traits, assessment of biodiversity, and production of many other types
of data and products dependent on DNA and amino acid sequences.

3. SUMMARY OF THE INVENTION 

The compositions of the present invention include novel isolated polypeptides, novel
isolated polynucleotides encoding such polypeptides, including recombinant DNA molecules,
cloned genes or degenerate variants thereof, especially naturally occurring variants such as allelic
variants, antisense polynucleotide molecules, and antibodies that specifically recognize one or more
epitopes present on such polypeptides, as well as hybridomas producing such antibodies.

The compositions of the present invention additionally include vectors, including expression
vectors, containing the polynucleotides of the invention, cells genetically engineered to contain such
polynucleotides and cells genetically engineered to express such polynucleotides.






WO 01/57188 PCT/US01/03800

The present invention relates to a collection or library of at least one novel nucleic acid
sequence assembled from expressed sequence tags (ESTs) isolated mainly by sequencing by
hybridization (SBH), and in some cases, sequences obtained from one or more public databases.
The invention relates also to the proteins encoded by such polynucleotides, along with therapeutic,
diagnostic and research utilities for these polynucleotides and proteins. These nucleic acid
sequences are designated SEQ ID NO: 1-1350. The polypeptide sequences are designated SEQ
ID NO: 1351-2700. The nucleic acids and polypeptides are provided in the Sequence Listing. In
the nucleic acids provided in the Sequence Listing, A is adenine; C is cytosine; G is guanine; T is
thymine; and N is any of the four bases. In the amino acids provided in the Sequence Listing, *
corresponds to the stop codon.

The nucleic acid sequences of the present invention also include nucleic acid sequences that
hybridize to the complement of SEQ ID NO: 1-1350 under stringent hybridization conditions;
nucleic acid sequences which are allelic variants or species homologues of any of the nucleic acid
sequences recited above; or nucleic acid sequences that encode a peptide comprising a specific
domain or truncation of the peptides encoded by SEQ ID NO: 1-1350. A polynucleotide
comprising a nucleotide sequence [...] SEQ
ID NO: 1-1350 or a degenerate variant or fragment thereof. The identifying sequence can be 100
base pairs in length.

The nucleic acid sequences of the present invention also include the sequence information
from the nucleic acid sequences of SEQ ID NO: 1-1350. The sequence information can be a
segment of any one of SEQ ID NO: 1-1350 that uniquely identifies or represents the sequence
information of SEQ ID NO: 1-1350.

A collection as used in this application can be a collection of only one polynucleotide. The
collection of sequence information [...]
a nucleic acid array. In one embodiment, segments of sequence information are provided on a
nucleic acid array to detect the polynucleotide that contains the segment [...]
to detect full-match or mismatch to the polynucleotide that contains the segment. The collection
can also be provided in a computer-readable format.

This invention also includes the reverse or direct complement of any of the nucleic acid
sequences recited above; cloning or expression vectors containing the nucleic acid sequences; and
host cells or organisms transformed with these expression vectors. Nucleic acid sequences (or their
reverse or direct complements) according to the invention have numerous applications in a variety
of techniques known to those skilled in the art of molecular biology, such as use as hybridization
probes, use as primers for PCR, use in an array, use in computer-readable media, use in sequencing
full-length genes, use for chromosome and gene mapping, use in the recombinant production of
protein, and use in the generation of anti-sense DNA or RNA, their chemical analogs and the like.

In a preferred embodiment, the nucleic acid sequences of SEQ ID NO: 1-1350 or novel
segments or parts of the nucleic acids of the invention are used as primers in expression assays that
are well known in the art. In a particularly preferred embodiment, the nucleic acid sequences of
SEQ ID NO: 1-1350 or novel segments or parts of the nucleic acids provided herein are used in
diagnostics for identifying expressed genes or, as well known in the art and exemplified by Vollrath
et al., Science 258:52-59 (1992), as expressed sequence tags for physical mapping of the human
genome.

The isolated polynucleotides of the invention include, but are not limited to, a
polynucleotide comprising any one of the nucleotide sequences set forth in SEQ ID NO: 1-1350; a
polynucleotide comprising any of the full length protein coding sequences of SEQ ID NO: 1-1350;
and a polynucleotide comprising any of the nucleotide sequences of the mature protein coding
sequences of SEQ ID NO: 1-1350. The polynucleotides of the present invention also include, but
are not limited to, a polynucleotide that hybridizes under stringent hybridization conditions to (a)
the complement of any one of the nucleotide sequences set forth in SEQ ID NO: 1-1350; (b) a
nucleotide sequence encoding any one of the amino acid sequences set forth in the Sequence Listing
(e.g., SEQ ID NO: 1351-2700); (c) a polynucleotide which is an allelic variant of any
polynucleotides recited above; (d) a polynucleotide which encodes a species homolog (e.g.,
orthologs) of any of the proteins recited above; or (e) a polynucleotide that encodes a polypeptide
comprising a specific domain or truncation of any of the polypeptides comprising an amino acid
sequence set forth in the Sequence Listing.

The isolated polypeptides of the invention include, but are not limited to, a polypeptide
comprising any of the amino acid sequences set forth in the Sequence Listing; or the corresponding
full length or mature protein. Polypeptides of the invention also include polypeptides with
biological activity that are encoded by (a) any of the polynucleotides having a nucleotide sequence
set forth in SEQ ID NO: 1-1350; or (b) polynucleotides that hybridize to the complement of the
polynucleotides of (a) under stringent hybridization conditions. Biologically or immunologically
active variants of any of the polypeptide sequences in the Sequence Listing, and "substantial
equivalents" thereof (e.g., with at least about 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or 99%
amino acid sequence identity) that preferably retain biological activity are also contemplated. The
polypeptides of the invention may be wholly or partially chemically synthesized but are preferably
produced by recombinant means using the genetically engineered cells (e.g., host cells) of the
invention.

The invention also provides compositions comprising a polypeptide of the invention.
Polypeptide compositions of the invention may further comprise an acceptable carrier, such as a
hydrophilic, e.g., pharmaceutically acceptable, carrier.

The invention also provides host cells transformed or transfected with a polynucleotide of
the invention.

The invention also relates to methods for producing a polypeptide of the invention
comprising growing a culture of the host cells of the invention in a suitable culture medium
under conditions permitting expression of the desired polypeptide, and purifying the polypeptide
from the culture or from the host cells. Preferred embodiments include those in which the
protein produced by such process is a mature form of the protein.

Polynucleotides according to the invention have numerous applications in a variety of
techniques known to those skilled in the art of molecular biology. These techniques include use
as hybridization probes, use as oligomers, or primers, for PCR, use for chromosome and gene
mapping, use in the recombinant production of protein, and use in generation of anti-sense DNA
or RNA, their chemical analogs and the like. For example, when the expression of an mRNA is
largely restricted to a particular cell or tissue type, polynucleotides of the invention can be used
as hybridization probes to detect the presence of the particular cell or tissue mRNA in a sample
using, e.g., in situ hybridization.

In other exemplary embodiments, the polynucleotides are used in diagnostics as
expressed sequence tags for identifying expressed genes or, as well known in the art and
exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence tags for
physical mapping of the human genome.

The polypeptides according to the invention can be used in a variety of conventional
procedures and methods that are currently applied to other proteins. For example, a polypeptide
of the invention can be used to generate an antibody that specifically binds the polypeptide. Such
antibodies, particularly monoclonal antibodies, are useful for detecting or quantitating the
polypeptide in tissue. The polypeptides of the invention can also be used as molecular weight
markers, and as a food supplement.

Methods are also provided for preventing, treating, or ameliorating a medical condition,
which comprise the step of administering to a mammalian subject a therapeutically effective
amount of a composition comprising a polypeptide of the present invention and a
pharmaceutically acceptable carrier.

In particular, the polypeptides and polynucleotides of the invention can be utilized, for
example, in methods for the prevention and/or treatment of disorders involving aberrant protein
expression or biological activity.

The present invention further relates to methods for detecting the presence of the
polynucleotides or polypeptides of the invention in a sample. Such methods can, for example, be
utilized as part of prognostic and diagnostic evaluation of disorders as recited herein and for the
identification of subjects exhibiting a predisposition to such conditions. The invention provides
a method for detecting the polynucleotides of the invention in a sample, comprising contacting
the sample with a compound that binds to and forms a complex with the polynucleotide of
interest for a period sufficient to form the complex and under conditions sufficient to form a
complex and detecting the complex such that if a complex is detected, the polynucleotide of
interest is detected. The invention also provides a method for detecting the polypeptides of the
invention in a sample comprising contacting the sample with a compound that binds to and forms
a complex with the polypeptide under conditions and for a period sufficient to form the complex
and detecting the formation of the complex such that if a complex is formed, the polypeptide is
detected.

The invention also provides kits comprising polynucleotide probes and/or monoclonal
antibodies, and optionally quantitative standards, for carrying out methods of the invention.
Furthermore, the invention provides methods for evaluating the efficacy of drugs, and
monitoring the progress of patients, involved in clinical trials for the treatment of disorders as
recited above.

The invention also provides methods for the identification of compounds that modulate
(i.e., increase or decrease) the expression or activity of the polynucleotides and/or polypeptides
of the invention. Such methods can be utilized, for example, for the identification of compounds
that can ameliorate symptoms of disorders as recited herein. Such methods can include, but are
not limited to, assays for identifying compounds and other substances that interact with (e.g.,
bind to) the polypeptides of the invention. The invention provides a method for identifying a
compound that binds to the polypeptides of the invention comprising contacting the compound
with a polypeptide of the invention in a cell for a time sufficient to form a
polypeptide/compound complex, wherein the complex drives expression of a reporter gene
sequence in the cell; and detecting the complex by detecting the reporter gene sequence
expression such that if expression of the reporter gene is detected the compound that binds to a
polypeptide of the invention is identified.

The methods of the invention also include methods for treatment which involve the
administration of the polynucleotides or polypeptides of the invention to individuals exhibiting
symptoms or tendencies. In addition, the invention encompasses methods for treating diseases or
disorders as recited herein comprising administering compounds and other substances that
modulate the overall activity of the target gene products. Compounds and other substances can
effect such modulation either on the level of target gene/protein expression or target protein
activity.

The polypeptides of the present invention and the polynucleotides encoding them are also
useful for the same functions known to one of skill in the art as the polypeptides and
polynucleotides to which they have homology (set forth in Table 2). If no homology is set forth
for a sequence, then the polypeptides and polynucleotides of the present invention are useful for
a variety of applications, as described herein, including use in arrays for detection.



4. DETAILED DESCRIPTION OF THE INVENTION

4.1 DEFINITIONS 

It must be noted that as used herein and in the appended claims, the singular forms "a,"
"an" and "the" include plural references unless the context clearly dictates otherwise.

The term "active" refers to those forms of the polypeptide which retain the biologic
and/or immunologic activities of any naturally occurring polypeptide. According to the
invention, the terms "biologically active" or "biological activity" refer to a protein or peptide
having structural, regulatory or biochemical functions of a naturally occurring molecule.
Likewise, "immunologically active" or "immunological activity" refers to the capability of the
natural, recombinant or synthetic polypeptide to induce a specific immune response in
appropriate animals or cells and to bind with specific antibodies.

The term "activated cells" as used in this application refers to those cells which are engaged in
extracellular or intracellular membrane trafficking, including the export of secretory or
enzymatic molecules as part of a normal or disease process.

The terms "complementary" or "complementarity" refer to the natural binding of
polynucleotides by base pairing. For example, the sequence 5'-AGT-3' binds to the
complementary sequence 3'-TCA-5'. Complementarity between two single-stranded molecules
may be "partial" such that only some of the nucleic acids bind or it may be "complete" such that
total complementarity exists between the single-stranded molecules. The degree of
complementarity between the nucleic acid strands has significant effects on the efficiency and
strength of the hybridization between the nucleic acid strands.
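
By way of non-limiting illustration, the base-pairing rule described above can be sketched in Python. The names PAIRS, complement_3_to_5 and complementarity are hypothetical helpers introduced only for this example; the sketch assumes unmodified DNA bases and two strands of equal length aligned without gaps.

```python
# Watson-Crick pairing for DNA (illustrative helper, not part of the specification).
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement_3_to_5(strand_5_to_3: str) -> str:
    """Return the complementary strand, written 3'->5', base by base."""
    return "".join(PAIRS[base] for base in strand_5_to_3.upper())


def complementarity(strand_5_to_3: str, other_3_to_5: str) -> float:
    """Fraction of aligned positions that form Watson-Crick pairs.

    1.0 corresponds to "complete" complementarity; a value strictly
    between 0 and 1 corresponds to "partial" complementarity.
    """
    if not strand_5_to_3 or len(strand_5_to_3) != len(other_3_to_5):
        raise ValueError("strands must be non-empty and of equal length")
    paired = sum(
        PAIRS[a] == b
        for a, b in zip(strand_5_to_3.upper(), other_3_to_5.upper())
    )
    return paired / len(strand_5_to_3)
```

Under these assumptions, complementarity("AGT", "TCA") evaluates to 1.0, matching the 5'-AGT-3' / 3'-TCA-5' example, while a strand pair with one non-pairing position out of three gives a partial value of about 0.67.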

The term "embryonic stem cells (ES)" refers to a cell that can give rise to many
differentiated cell types in an embryo or an adult, including the germ cells. The term "germ line
stem cells (GSCs)" refers to stem cells derived from primordial stem cells that provide a steady
and continuous source of germ cells for the production of gametes. The term "primordial germ
cells (PGCs)" refers to a small population of cells set aside from other cell lineages, particularly
from the yolk sac, mesenteries, or gonadal ridges, during embryogenesis that have the potential to
differentiate into germ cells and other cells. PGCs are the source from which GSCs and ES cells
are derived. The PGCs, the GSCs and the ES cells are capable of self-renewal. Thus these cells
not only populate the germ line and give rise to a plurality of terminally differentiated cells that
comprise the adult specialized organs, but are able to regenerate themselves.

The term "expression modulating fragment," EMF, means a series of nucleotides which
modulates the expression of an operably linked ORF or another EMF.

As used herein, a sequence is said to "modulate the expression of an operably linked
sequence" when the expression of the sequence is altered by the presence of the EMF. EMFs
include, but are not limited to, promoters, and promoter modulating sequences (inducible
elements). One class of EMFs are nucleic acid fragments which induce the expression of an
operably linked ORF in response to a specific regulatory factor or physiological event.

The terms "nucleotide sequence" or "nucleic acid" or "polynucleotide" or
"oligonucleotide" are used interchangeably and refer to a heteropolymer of nucleotides or the
sequence of these nucleotides. These phrases also refer to DNA or RNA of genomic or synthetic
origin which may be single-stranded or double-stranded and may represent the sense or the
antisense strand, to peptide nucleic acid (PNA) or to any DNA-like or RNA-like material. In the
sequences herein A is adenine, C is cytosine, T is thymine, G is guanine and N is A, C, G or T
(U). It is contemplated that where the polynucleotide is RNA, the T (thymine) in the sequences
provided herein is substituted with U (uracil). Generally, nucleic acid segments provided by this
invention may be assembled from fragments of the genome and short oligonucleotide linkers, or
from a series of oligonucleotides, or from individual nucleotides, to provide a synthetic nucleic
acid which is capable of being expressed in a recombinant transcriptional unit comprising
regulatory elements derived from a microbial or viral operon, or a eukaryotic gene.
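
As a minimal sketch of the T-to-U substitution contemplated above for RNA, the following Python fragment may be considered. The function name to_rna and the constant DNA_ALPHABET are hypothetical and introduced only for illustration; the sketch assumes sequences written with the letters A, C, G, T and N as defined above.

```python
# Letters used for nucleic acids in the sequences herein (N = A, C, G or T).
DNA_ALPHABET = set("ACGTN")


def to_rna(dna: str) -> str:
    """Rewrite a DNA-notation sequence as RNA by substituting U for T."""
    dna = dna.upper()
    unknown = set(dna) - DNA_ALPHABET
    if unknown:
        # Reject letters outside the assumed alphabet rather than guess.
        raise ValueError(f"unexpected symbols: {sorted(unknown)}")
    return dna.replace("T", "U")
```

For example, to_rna("AGTN") returns "AGUN"; the ambiguity letter N is left unchanged.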

The terms "oligonucleotide fragment" or a "polynucleotide fragment", "portion," or
"segment" or "probe" or "primer" are used interchangeably and refer to a sequence of nucleotide
residues which are at least about 5 nucleotides, more preferably at least about 7 nucleotides,
more preferably at least about 9 nucleotides, more preferably at least about 11 nucleotides and
most preferably at least about 17 nucleotides. The fragment is preferably less than about 500
nucleotides, preferably less than about 200 nucleotides, more preferably less than about 100
nucleotides, more preferably less than about 50 nucleotides and most preferably less than 30
nucleotides. Preferably the probe is from about 6 nucleotides to about 200 nucleotides,
preferably from about 15 to about 50 nucleotides, more preferably from about 17 to 30
nucleotides and most preferably from about 20 to 25 nucleotides. Preferably the fragments can
be used in polymerase chain reaction (PCR), various hybridization procedures or microarray
procedures to identify or amplify identical or related parts of mRNA or DNA molecules. A
fragment or segment may uniquely identify each polynucleotide sequence of the present
invention. Preferably the fragment comprises a sequence substantially similar to any one of SEQ
ID NOs: 1-1350.

Probes may, for example, be used to determine whether specific mRNA molecules are
present in a cell or tissue or to isolate similar nucleic acid sequences from chromosomal DNA as
described by Walsh et al. (Walsh, P.S. et al., 1992, PCR Methods Appl 1:241-250). They may
be labeled by nick translation, Klenow fill-in reaction, PCR, or other methods well known in the
art. Probes of the present invention, their preparation and/or labeling are elaborated in
Sambrook, J. et al., 1989, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor
Laboratory, NY; or Ausubel, F.M. et al., 1989, Current Protocols in Molecular Biology, John
Wiley & Sons, New York NY, both of which are incorporated herein by reference in their
entirety.

The nucleic acid sequences of the present invention also include the sequence
information from the nucleic acid sequences of SEQ ID NO: 1-1350. The sequence information
can be a segment of any one of SEQ ID NO: 1-1350 that uniquely identifies or represents the
sequence information of that sequence of SEQ ID NO: 1-1350. One such segment can be a
twenty-mer nucleic acid sequence because the probability that a twenty-mer is fully matched in
the human genome is approximately 1 in 300. In the human genome, there are three billion base
pairs in one set of chromosomes. Because 4^20 possible twenty-mers exist, there are approximately
300 times more twenty-mers than there are base pairs in a set of human chromosomes. Using the
same analysis, the probability for a seventeen-mer to be fully matched in the human genome is
approximately 1 in 5. When these segments are used in arrays for expression studies, fifteen-mer
segments can be used. The probability that the fifteen-mer is fully matched in the expressed
sequences is also approximately one in five because expressed sequences comprise less than
approximately 5% of the entire genome sequence.
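
The full-match estimates above can be reproduced, to order of magnitude, with the short Python sketch below. It assumes a simplified model in which every position of the target is an independent, uniformly random base; the constant and function names are hypothetical and introduced only for this illustration. The genome size and the 5% expressed fraction are the round figures stated in the text.

```python
GENOME_BP = 3_000_000_000    # base pairs in one set of human chromosomes (from the text)
EXPRESSED_FRACTION = 0.05    # upper bound on the expressed fraction (from the text)


def expected_full_matches(k: int, target_bp: float = GENOME_BP) -> float:
    """Expected number of full matches to one given k-mer in a random target.

    There are 4**k possible k-mers, so a given k-mer matches at any one
    position with probability 1 / 4**k under the uniform-random assumption.
    """
    return target_bp / 4 ** k


def one_in(k: int, target_bp: float = GENOME_BP) -> float:
    """Express the same estimate as "1 in N"."""
    return 1.0 / expected_full_matches(k, target_bp)
```

Under this model, one_in(20) is about 366 and one_in(17) is about 5.7 for the whole genome, and one_in(15, GENOME_BP * EXPRESSED_FRACTION) is about 7.2 for expressed sequences, which agrees with the rounded figures of 1 in 300 and approximately one in five given above.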

Similarly, when using sequence information for detecting a single mismatch, a segment can
be a twenty-five-mer. The probability that the twenty-five-mer would appear in a human genome
with a single mismatch is calculated by multiplying the probability for a full match (1/4^25) times the
increased probability for mismatch at each nucleotide position (3 x 25). The probability that an
eighteen-mer with a single mismatch can be detected in an array for expression studies is
approximately one in five. The probability that a twenty-mer with a single mismatch can be
detected in a human genome is approximately one in five.

The term "open reading frame," ORF, means a series of nucleotide triplets coding for
amino acids without any termination codons and is a sequence translatable into protein.
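
The definition above can be illustrated with a minimal Python sketch that reports the stop-free runs of whole codons in one reading frame. The function name open_stretches is hypothetical; the sketch assumes DNA notation and the three termination codons of the standard genetic code (TAA, TAG, TGA), and it does not require an initiating ATG because the definition above does not.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # termination codons, standard genetic code


def open_stretches(seq: str, frame: int = 0):
    """Yield (start, end) index pairs of maximal stop-free codon runs.

    `frame` is 0, 1 or 2; `end` is exclusive, so seq[start:end] is a
    series of whole triplets containing no termination codon.
    """
    seq = seq.upper()
    start = frame
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            if i > start:
                yield (start, i)
            start = i + 3          # resume after the stop codon
    # Last whole-codon boundary reachable in this frame.
    end = frame + ((len(seq) - frame) // 3) * 3
    if end > start:
        yield (start, end)
```

For the hypothetical input "ATGGCCTAAGGGTGA" in frame 0, the sketch yields (0, 6) and (9, 12), i.e. the triplets ATG-GCC and GGG on either side of the TAA codon.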

The terms "operably linked" or "operably associated" refer to functionally related nucleic
acid sequences. For example, a promoter is operably associated or operably linked with a coding
sequence if the promoter controls the transcription of the coding sequence. While operably
linked nucleic acid sequences can be contiguous and in the same reading frame, certain genetic
elements, e.g., repressor genes, are not contiguously linked to the coding sequence but still control
transcription/translation of the coding sequence.

The term "pluripotent" refers to the capability of a cell to differentiate into a number of
differentiated cell types that are present in an adult organism. A pluripotent cell is restricted in its
differentiation capability in comparison to a totipotent cell.

The terms "polypeptide" or "peptide" or "amino acid sequence" refer to an oligopeptide,
peptide, polypeptide or protein sequence or fragment thereof and to naturally occurring or
synthetic molecules. A polypeptide "fragment," "portion," or "segment" is a stretch of amino
acid residues of at least about 5 amino acids, preferably at least about 7 amino acids, more
preferably at least about 9 amino acids and most preferably at least about 17 or more amino
acids. The peptide preferably is not greater than about 200 amino acids, more preferably less
than 150 amino acids and most preferably less than 100 amino acids. Preferably the peptide is
from about 5 to about 200 amino acids. To be active, any polypeptide must have sufficient
length to display biological and/or immunological activity.

The term "naturally occurring polypeptide" refers to polypeptides produced by cells that
have not been genetically engineered and specifically contemplates various polypeptides arising
from post-translational modifications of the polypeptide including, but not limited to,
acetylation, carboxylation, glycosylation, phosphorylation, lipidation and acylation.

The term "translated protein coding portion" means a sequence which encodes the full
length protein which may include any leader sequence or any processing sequence.

The term "mature protein coding sequence" means a sequence which encodes a peptide
or protein without a signal or leader sequence. The "mature protein portion" means that portion
of the protein which does not include a signal or leader sequence. The peptide may have been
produced by processing in the cell which removes any leader/signal sequence. The mature
protein portion may or may not include the initial methionine residue. The methionine residue
may be removed from the protein during processing in the cell. The peptide may be produced
synthetically or the protein may have been produced using a polynucleotide encoding only
the mature protein coding sequence.

The term "derivative" refers to polypeptides chemically modified by such techniques as
ubiquitination, labeling (e.g., with radionuclides or various enzymes), covalent polymer
attachment such as pegylation (derivatization with polyethylene glycol) and insertion or
substitution by chemical synthesis of amino acids such as ornithine, which do not normally occur
in human proteins.

The term "variant" (or "analog") refers to any polypeptide differing from naturally
occurring polypeptides by amino acid insertions, deletions, and substitutions, created using, e.g.,
recombinant DNA techniques. Guidance in determining which amino acid residues may be
replaced, added or deleted without abolishing activities of interest, may be found by comparing
the sequence of the particular polypeptide with that of homologous peptides and minimizing the
number of amino acid sequence changes made in regions of high homology (conserved regions)
or by replacing amino acids with consensus sequence.

Alternatively, recombinant variants encoding these same or similar polypeptides may be
synthesized or selected by making use of the "redundancy" in the genetic code. Various codon
substitutions, such as the silent changes which produce various restriction sites, may be
introduced to optimize cloning into a plasmid or viral vector or expression in a particular
prokaryotic or eukaryotic system. Mutations in the polynucleotide sequence may be reflected in
the polypeptide or domains of other peptides added to the polypeptide to modify the properties of
any part of the polypeptide, to change characteristics such as ligand-binding affinities, interchain
affinities, or degradation/turnover rate.
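
A silent codon substitution of the kind described above can be sketched in Python as follows. The example sequences and the names CODON_TABLE, translate and is_silent are hypothetical and chosen only for illustration; the sketch assumes the standard genetic code and uses the BamHI recognition sequence GGATCC as the restriction site being introduced.

```python
# Standard genetic code, built from the conventional TCAG ordering ("*" = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA coding sequence into one-letter amino acids."""
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def is_silent(original: str, variant: str) -> bool:
    """True if the two coding sequences encode the same polypeptide."""
    return translate(original) == translate(variant)


BAMHI_SITE = "GGATCC"
original = "ATGGGATCTAAA"   # hypothetical coding sequence: Met-Gly-Ser-Lys
variant = "ATGGGATCCAAA"    # TCT -> TCC, both encoding serine
```

Here is_silent(original, variant) is True because both sequences translate to "MGSK", yet only the variant contains BAMHI_SITE, so the encoded polypeptide is unchanged while a cloning site is gained.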

Preferably, amino acid "substitutions" are the result of replacing one amino acid with 
another amino acid having similar structural and/or chemical properties, i.e., conservative amino 
acid replacements. "Conservative" amino acid substitutions may be made on the basis of 
similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic 
nature of the residues involved. For example, nonpolar (hydrophobic) amino acids include 
alanine, leucine, isoleucine, valine, proline, phenylalanine, tryptophan, and methionine; polar 
neutral amino acids include glycine, serine, threonine, cysteine, tyrosine, asparagine, and 
glutamine; positively charged (basic) amino acids include arginine, lysine, and histidine; and 
negatively charged (acidic) amino acids include aspartic acid and glutamic acid. "Insertions" or 
"deletions" are preferably in the range of about 1 to 20 amino acids, more preferably 1 to 10 
amino acids. The variation allowed may be experimentally determined by systematically making 
insertions, deletions, or substitutions of amino acids in a polypeptide molecule using 
recombinant DNA techniques and assaying the resulting recombinant variants for activity. 
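
The residue groupings listed above lend themselves to a simple lookup, sketched here in Python. The names RESIDUE_CLASSES, residue_class and is_conservative are hypothetical; the sketch uses one-letter amino acid codes and treats a substitution as "conservative" only when both residues fall in the same one of the four groups named above, which is a simplification of the broader similarity criteria in the text.

```python
# Groupings taken from the paragraph above, in one-letter amino acid codes.
RESIDUE_CLASSES = {
    "nonpolar": set("ALIVPFWM"),      # Ala, Leu, Ile, Val, Pro, Phe, Trp, Met
    "polar_neutral": set("GSTCYNQ"),  # Gly, Ser, Thr, Cys, Tyr, Asn, Gln
    "basic": set("RKH"),              # Arg, Lys, His
    "acidic": set("DE"),              # Asp, Glu
}


def residue_class(residue: str) -> str:
    """Return the name of the group containing the given residue."""
    residue = residue.upper()
    for name, members in RESIDUE_CLASSES.items():
        if residue in members:
            return name
    raise ValueError(f"not one of the twenty listed amino acids: {residue!r}")


def is_conservative(original: str, replacement: str) -> bool:
    """True if both residues belong to the same group."""
    return residue_class(original) == residue_class(replacement)
```

For instance, replacing leucine with isoleucine (is_conservative("L", "I")) is conservative under this grouping, whereas replacing lysine with glutamic acid (is_conservative("K", "E")) is not.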

Alternatively, where alteration of function is desired, insertions, deletions or
non-conservative alterations can be engineered to produce altered polypeptides. Such alterations
can, for example, alter one or more of the biological functions or biochemical characteristics of
the polypeptides of the invention. For example, such alterations may change polypeptide
characteristics such as ligand-binding affinities, interchain affinities, or degradation/turnover
rate. Further, such alterations can be selected so as to generate polypeptides that are better suited
for expression, scale up and the like in the host cells chosen for expression. For example,
cysteine residues can be deleted or substituted with another amino acid residue in order to
eliminate disulfide bridges.

The terms "purified" or "substantially purified" as used herein denote that the indicated
nucleic acid or polypeptide is present in the substantial absence of other biological
macromolecules, e.g., polynucleotides, proteins, and the like. In one embodiment, the
polynucleotide or polypeptide is purified such that it constitutes at least 95% by weight, more
preferably at least 99% by weight, of the indicated biological macromolecules present (but water,
buffers, and other small molecules, especially molecules having a molecular weight of less than
1000 daltons, can be present).

The term "isolated" as used herein refers to a nucleic acid or polypeptide separated from
at least one other component (e.g., nucleic acid or polypeptide) present with the nucleic acid or
polypeptide in its natural source. In one embodiment, the nucleic acid or polypeptide is found in
the presence of (if anything) only a solvent, buffer, ion, or other component normally present in a
solution of the same. The terms "isolated" and "purified" do not encompass nucleic acids or
polypeptides present in their natural source.

The term "recombinant," when used herein to refer to a polypeptide or protein, means
that a polypeptide or protein is derived from recombinant (e.g., microbial, insect, or mammalian)
expression systems. "Microbial" refers to recombinant polypeptides or proteins made in
bacterial or fungal (e.g., yeast) expression systems. As a product, "recombinant microbial"
defines a polypeptide or protein essentially free of native endogenous substances and
unaccompanied by associated native glycosylation. Polypeptides or proteins expressed in most
bacterial cultures, e.g., E. coli, will be free of glycosylation modifications; polypeptides or
proteins expressed in yeast will have a glycosylation pattern in general different from those
expressed in mammalian cells.

The term "recombinant expression vehicle or vector" refers to a plasmid or phage or virus
or vector, for expressing a polypeptide from a DNA (RNA) sequence. An expression vehicle can
comprise a transcriptional unit comprising an assembly of (1) a genetic element or elements
having a regulatory role in gene expression, for example, promoters or enhancers, (2) a structural
or coding sequence which is transcribed into mRNA and translated into protein, and (3)
appropriate transcription initiation and termination sequences. Structural units intended for use
in yeast or eukaryotic expression systems preferably include a leader sequence enabling
extracellular secretion of translated protein by a host cell. Alternatively, where recombinant
protein is expressed without a leader or transport sequence, it may include an amino terminal
methionine residue. This residue may or may not be subsequently cleaved from the expressed
recombinant protein to provide a final product.

The term "recombinant expression system" means host cells which have stably integrated
a recombinant transcriptional unit into chromosomal DNA or carry the recombinant
transcriptional unit extrachromosomally. Recombinant expression systems as defined herein will
express heterologous polypeptides or proteins upon induction of the regulatory elements linked
to the DNA segment or synthetic gene to be expressed. This term also means host cells which
have stably integrated a recombinant genetic element or elements having a regulatory role in
gene expression, for example, promoters or enhancers. Recombinant expression systems as
defined herein will express polypeptides or proteins endogenous to the cell upon induction of the
regulatory elements linked to the endogenous DNA segment or gene to be expressed. The cells
can be prokaryotic or eukaryotic.

The term "secreted" includes a protein that is transported across or through a membrane,
including transport as a result of signal sequences in its amino acid sequence when it is
expressed in a suitable host cell. "Secreted" proteins include without limitation proteins secreted
wholly (e.g., soluble proteins) or partially (e.g., receptors) from the cell in which they are
expressed. "Secreted" proteins also include without limitation proteins that are transported
across the membrane of the endoplasmic reticulum. "Secreted" proteins are also intended to
include proteins containing non-typical signal sequences (e.g., Interleukin-1 Beta, see Krasney,
P.A. and Young, P.R. (1992) Cytokine 4(2):134-143) and factors released from damaged cells
(e.g., Interleukin-1 Receptor Antagonist, see Arend, W.P. et al. (1998) Annu. Rev. Immunol.
16:27-55).

Where desired, an expression vector may be designed to contain a "signal or leader 
sequence" which will direct the polypeptide through the membrane of a cell. Such a sequence 
may be naturally present on the polypeptides of the present invention or provided from 
heterologous protein sources by recombinant DNA techniques. 

The term "stringent" is used to refer to conditions that are commonly understood in the
art as stringent. Stringent conditions can include highly stringent conditions (i.e., hybridization
to filter-bound DNA in 0.5 M NaHPO4, 7% sodium dodecyl sulfate (SDS), 1 mM EDTA at
65°C, and washing in 0.1X SSC/0.1% SDS at 68°C), and moderately stringent conditions (i.e.,
washing in 0.2X SSC/0.1% SDS at 42°C). Other exemplary hybridization conditions are
described herein in the examples.

In instances of hybridization of deoxyoligonucleotides, additional exemplary stringent
hybridization conditions include washing in 6X SSC/0.05% sodium pyrophosphate at 37°C (for
14-base oligonucleotides), 48°C (for 17-base oligonucleotides), 55°C (for 20-base
oligonucleotides), and 60°C (for 23-base oligonucleotides).
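
For illustration only, the exemplary oligonucleotide wash temperatures listed above can be kept as a small lookup table. The following Python sketch is not part of the disclosed methods; the names OLIGO_WASH_TEMP_C and oligo_wash_temperature are hypothetical, and the sketch deliberately does not interpolate between the listed lengths, since no rule for intermediate lengths is stated here.

```python
# Wash temperatures (degrees C) in 6X SSC/0.05% sodium pyrophosphate, keyed by
# oligonucleotide length in bases, as listed in the paragraph above.
# The table name and function name are hypothetical, chosen for this sketch.
OLIGO_WASH_TEMP_C = {14: 37, 17: 48, 20: 55, 23: 60}


def oligo_wash_temperature(length_bases):
    """Return the listed wash temperature, or None if the length is not listed."""
    return OLIGO_WASH_TEMP_C.get(length_bases)


if __name__ == "__main__":
    for n in (14, 17, 18, 23):
        print(n, oligo_wash_temperature(n))
```

A length that is not among the four listed values returns None rather than a guessed temperature.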
As used herein, "substantially equivalent" can refer both to nucleotide and amino acid
sequences, for example a mutant sequence, that varies from a reference sequence by one or more
substitutions, deletions, or additions, the net effect of which does not result in an adverse
functional dissimilarity between the reference and subject sequences. Typically, such a
substantially equivalent sequence varies from one of those listed herein by no more than about
35% (i.e., the number of individual residue substitutions, additions, and/or deletions in a
substantially equivalent sequence, as compared to the corresponding reference sequence, divided
by the total number of residues in the substantially equivalent sequence is about 0.35 or less).
Such a sequence is said to have 65% sequence identity to the listed sequence. In one
embodiment, a substantially equivalent, e.g., mutant, sequence of the invention varies from a
listed sequence by no more than 30% (70% sequence identity); in a variation of this
embodiment, by no more than 25% (75% sequence identity); in a further variation of this
embodiment, by no more than 20% (80% sequence identity); in a further variation of this
embodiment, by no more than 10% (90% sequence identity); and in a further variation of this
embodiment, by no more than 5% (95% sequence identity). Substantially equivalent, e.g.,
mutant, amino acid sequences according to the invention preferably have at least 80% sequence
identity with a listed amino acid sequence, more preferably at least 85% sequence identity, more
preferably at least 90% sequence identity, more preferably at least 95% identity, more preferably
at least 98% identity, and most preferably at least 99% identity. Substantially equivalent
nucleotide sequences of the invention can have lower percent sequence identities, taking into
account, for example, the redundancy or degeneracy of the genetic code. Preferably, a nucleotide
sequence has at least about 65% identity, more preferably at least about 75% identity, more
preferably at least about 80% sequence identity, more preferably at least about 85% sequence
identity, more preferably at least about 90% sequence identity, more preferably at least about
95% identity, more preferably at least about 98% sequence identity, and most preferably at least
about 99% sequence identity. For the purposes of the present invention, sequences having
substantially equivalent biological activity and substantially equivalent expression characteristics
are considered substantially equivalent. For the purposes of determining equivalence, truncation
of the mature sequence (e.g., via a mutation which creates a spurious stop codon) should be
disregarded. Sequence identity may be determined, e.g., using the Jotun Hein method (Hein, J.
(1990) Methods Enzymol. 183:626-645). Identity between sequences can also be determined by
other methods known in the art, e.g., by varying hybridization conditions.
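
The arithmetic in the definition above (changed residues divided by the total number of residues in the substantially equivalent sequence) can be sketched in Python as follows. This is a minimal illustration, not the Jotun Hein method: it assumes the two sequences have already been aligned by some other tool and are supplied as equal-length strings with "-" marking gaps. The function name percent_variation and the gap convention are assumptions made for this sketch.

```python
def percent_variation(reference_aligned, subject_aligned, gap="-"):
    """Percent variation of an aligned subject sequence from a reference.

    Counts substitutions, additions and deletions column by column and divides
    by the number of residues in the subject (substantially equivalent)
    sequence, following the definition in the text. Inputs must be pre-aligned.
    """
    if len(reference_aligned) != len(subject_aligned):
        raise ValueError("aligned sequences must have equal length")
    changes = 0
    subject_residues = 0
    for ref, sub in zip(reference_aligned, subject_aligned):
        if ref == gap and sub == gap:
            continue  # column carries no residue in either sequence
        if sub != gap:
            subject_residues += 1
        if ref != sub:
            changes += 1  # substitution, addition (ref gap) or deletion (sub gap)
    if subject_residues == 0:
        raise ValueError("subject sequence has no residues")
    return 100.0 * changes / subject_residues


if __name__ == "__main__":
    variation = percent_variation("ACGTACGTAC", "ACGTTCGTAC")
    print(variation, 100.0 - variation)  # variation and corresponding identity
```

Under this convention, percent sequence identity is simply 100 minus the returned value, so a variation of 35% corresponds to the 65% identity threshold stated above.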

The term "totipotent" refers to the capability of a cell to differentiate into all of the cell
types of an adult organism.

The term "transformation" means introducing DNA into a suitable host cell so that the 
DNA is replicable, either as an extrachromosomal element or by chromosomal integration. The 
term "transfection" refers to the taking up of an expression vector by a suitable host cell, whether 
or not any coding sequences are in fact expressed. The term "infection" refers to the introduction
of nucleic acids into a suitable host cell by use of a virus or viral vector. 

As used herein, an "uptake modulating fragment," UMF, means a series of nucleotides 
which mediate the uptake of a linked DNA fragment into a cell. UMFs can be readily identified 
using known UMFs as a target sequence or target motif with the computer-based systems 
described below. The presence and activity of a UMF can be confirmed by attaching the 
suspected UMF to a marker sequence. The resulting nucleic acid molecule is then incubated 
with an appropriate host under appropriate conditions and the uptake of the marker sequence is
determined. As described above, a UMF will increase the frequency of uptake of a linked
marker sequence. 

Each of the above terms is meant to encompass all that is described for each, unless the
context dictates otherwise.

4.2 NUCLEIC ACIDS OF THE INVENTION 

Nucleotide sequences of the invention are set forth in the Sequence Listing.
The isolated polynucleotides of the invention include a polynucleotide comprising the
nucleotide sequences of SEQ ID NO:1-1350; a polynucleotide encoding any one of the peptide
sequences of SEQ ID NO:1351-2700; and a polynucleotide comprising the nucleotide sequence
encoding the mature protein coding sequence of the polypeptides of any one of SEQ ID
NO:1351-2700. The polynucleotides of the present invention also include, but are not limited to,
a polynucleotide that hybridizes under stringent conditions to (a) the complement of any of the
nucleotide sequences of SEQ ID NO:1-1350; (b) nucleotide sequences encoding any one of the
amino acid sequences set forth in the Sequence Listing; (c) a polynucleotide which is an allelic
variant of any polynucleotide recited above; (d) a polynucleotide which encodes a species
homolog of any of the proteins recited above; or (e) a polynucleotide that encodes a polypeptide
comprising a specific domain or truncation of the polypeptides of SEQ ID NO:1351-2700.
Domains of interest may depend on the nature of the encoded polypeptide; e.g., domains in
receptor-like polypeptides include ligand-binding, extracellular, transmembrane, or cytoplasmic
domains, or combinations thereof; domains in immunoglobulin-like proteins include the variable
immunoglobulin-like domains; domains in enzyme-like polypeptides include catalytic and
substrate binding domains; and domains in ligand polypeptides include receptor-binding
domains.

The polynucleotides of the invention include naturally occurring or wholly or partially
synthetic DNA, e.g., cDNA and genomic DNA, and RNA, e.g., mRNA. The polynucleotides
may include all of the coding region of the cDNA or may represent a portion of the coding
region of the cDNA.

The present invention also provides genes corresponding to the cDNA sequences disclosed
herein. The corresponding genes can be isolated in accordance with known methods using the
sequence information disclosed herein. Such methods include the preparation of probes or primers
from the disclosed sequence information for identification and/or amplification of genes in
appropriate genomic libraries or other sources of genomic materials. Further 5' and 3' sequence can
be obtained using methods known in the art. For example, full length cDNA or genomic DNA that
corresponds to any of the polynucleotides of SEQ ID NO:1-1350 can be obtained by screening
appropriate cDNA or genomic DNA libraries under suitable hybridization conditions using any of
the polynucleotides of SEQ ID NO:1-1350 or a portion thereof as a probe. Alternatively, the
polynucleotides of SEQ ID NO:1-1350 may be used as the basis for suitable primer(s) that allow
identification and/or amplification of genes in appropriate genomic DNA or cDNA libraries.

The nucleic acid sequences of the invention can be assembled from ESTs and sequences
(including cDNA and genomic sequences) obtained from one or more public databases, such as
dbEST, gbpri, and UniGene. The EST sequences can provide identifying sequence information,
representative fragment or segment information, or novel segment information for the full-length
gene.

The polynucleotides of the invention also provide polynucleotides including nucleotide
sequences that are substantially equivalent to the polynucleotides recited above. Polynucleotides
according to the invention can have, e.g., at least about 65%, at least about 70%, at least about
75%, at least about 80%, 81%, 82%, 83%, 84%, more typically at least about 85%, 86%, 87%,
88%, 89%, more typically at least about 90%, 91%, 92%, 93%, 94%, and even more typically at
least about 95%, 96%, 97%, 98%, 99%, sequence identity to a polynucleotide recited above.

Included within the scope of the nucleic acid sequences of the invention are nucleic acid
sequence fragments that hybridize under stringent conditions to any of the nucleotide sequences
of SEQ ID NO:1-1350, or complements thereof, which fragment is greater than about 5
nucleotides, preferably 7 nucleotides, more preferably greater than 9 nucleotides and most
preferably greater than 17 nucleotides. Fragments of, e.g., 15, 17, or 20 nucleotides or more that
are selective for (i.e., specifically hybridize to) any one of the polynucleotides of the invention
are contemplated. Probes capable of specifically hybridizing to a polynucleotide can
differentiate polynucleotide sequences of the invention from other polynucleotide sequences in
the same family of genes or can differentiate human genes from genes of other species, and are
preferably based on unique nucleotide sequences.

The sequences falling within the scope of the present invention are not limited to these
specific sequences, but also include allelic and species variations thereof. Allelic and species
variations can be routinely determined by comparing the sequence provided in SEQ ID NO:1-1350, a
representative fragment thereof, or a nucleotide sequence at least 90% identical, preferably 95%
identical, to SEQ ID NO:1-1350 with a sequence from another isolate of the same species.
Furthermore, to accommodate codon variability, the invention includes nucleic acid molecules
coding for the same amino acid sequences as do the specific ORFs disclosed herein. In other words,
in the coding region of an ORF, substitution of one codon for another codon that encodes the same
amino acid is expressly contemplated.
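
The codon-variability point above can be illustrated with a short Python sketch that translates two ORFs with the standard genetic code and checks whether they encode the same amino acid sequence. The sketch assumes the standard code, in-frame DNA input and only the bases A, C, G and T; the names CODON_TABLE, translate and encodes_same_protein are hypothetical, and the example sequences are invented for illustration and are not taken from the Sequence Listing.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(orf):
    """Translate an in-frame DNA ORF; '*' marks a stop codon."""
    orf = orf.upper()
    usable = len(orf) - len(orf) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, usable, 3))


def encodes_same_protein(orf_a, orf_b):
    """True if both ORFs encode an identical amino acid sequence."""
    return translate(orf_a) == translate(orf_b)


if __name__ == "__main__":
    # GCT and GCC both encode alanine, so these two ORFs are synonymous.
    print(encodes_same_protein("ATGGCTTAA", "ATGGCCTAA"))
    # GAT encodes aspartate, so this pair is not synonymous.
    print(encodes_same_protein("ATGGCTTAA", "ATGGATTAA"))
```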

The nearest neighbor or homology result for the nucleic acids of the present invention,
including SEQ ID NO:1-1350, can be obtained by searching a database using an algorithm or a
program. Preferably, BLAST, which stands for Basic Local Alignment Search Tool, is used to
search for local sequence alignments (Altschul, S.F., J. Mol. Evol. 36:290-300 (1993) and Altschul,
S.F. et al., J. Mol. Biol. 215:403-410 (1990)). Alternatively, a FASTA version 3 search against
Genpept, using the Fastxy algorithm, can be used.

Species homologs (or orthologs) of the disclosed polynucleotides and proteins are also 
provided by the present invention. Species homologs may be isolated and identified by making 
suitable probes or primers from the sequences provided herein and screening a suitable nucleic 
acid source from the desired species. 

The invention also encompasses allelic variants of the disclosed polynucleotides or
proteins; that is, naturally-occurring alternative forms of the isolated polynucleotide which also
encode proteins which are identical, homologous or related to that encoded by the
polynucleotides.

The nucleic acid sequences of the invention are further directed to sequences which
encode variants of the described nucleic acids. These amino acid sequence variants may be
prepared by methods known in the art by introducing appropriate nucleotide changes into a
native or variant polynucleotide. There are two variables in the construction of amino acid
sequence variants: the location of the mutation and the nature of the mutation. Nucleic acids
encoding the amino acid sequence variants are preferably constructed by mutating the
polynucleotide to encode an amino acid sequence that does not occur in nature. These nucleic
acid alterations can be made at sites that differ in the nucleic acids from different species
(variable positions) or in highly conserved regions (constant regions). Sites at such locations
will typically be modified in series, e.g., by substituting first with conservative choices (e.g.,
hydrophobic amino acid to a different hydrophobic amino acid) and then with more distant
choices (e.g., hydrophobic amino acid to a charged amino acid), and then deletions or insertions
may be made at the target site. Amino acid sequence deletions generally range from about 1 to
30 residues, preferably about 1 to 10 residues, and are typically contiguous. Amino acid
insertions include amino- and/or carboxyl-terminal fusions ranging in length from one to one
hundred or more residues, as well as intrasequence insertions of single or multiple amino acid
residues. Intrasequence insertions may range generally from about 1 to 10 amino acid residues,
preferably from 1 to 5 residues. Examples of terminal insertions include the heterologous signal
sequences necessary for secretion or for intracellular targeting in different host cells and
sequences such as FLAG or poly-histidine sequences useful for purifying the expressed protein.

In a preferred method, polynucleotides encoding the novel amino acid sequences are
changed via site-directed mutagenesis. This method uses oligonucleotide sequences to alter a
polynucleotide to encode the desired amino acid variant, as well as sufficient adjacent
nucleotides on both sides of the changed amino acid to form a stable duplex on either side of the
site being changed. In general, the techniques of site-directed mutagenesis are well known to
those of skill in the art and this technique is exemplified by publications such as Edelman et al.,
DNA 2:183 (1983). A versatile and efficient method for producing site-specific changes in a
polynucleotide sequence was published by Zoller and Smith, Nucleic Acids Res. 10:6487-6500
(1982). PCR may also be used to create amino acid sequence variants of the novel nucleic acids.
When small amounts of template DNA are used as starting material, primer(s) that differ
slightly in sequence from the corresponding region in the template DNA can generate the desired
amino acid variant. PCR amplification results in a population of product DNA fragments that
differ from the polynucleotide template encoding the polypeptide at the position specified by the
primer. The product DNA fragments replace the corresponding region in the plasmid and this
gives a polynucleotide encoding the desired amino acid variant.
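
As a rough illustration of the oligonucleotide design described above, the following Python sketch builds a mutagenic oligonucleotide that carries a replacement codon flanked on both sides by unchanged template sequence. It is a simplified sketch under stated assumptions: the function name mutagenic_oligo, the default flank length of 12 nucleotides and the example template are hypothetical, and the sketch does not evaluate duplex stability, which in practice determines how much flanking sequence is sufficient.

```python
def mutagenic_oligo(template, codon_index, new_codon, flank=12):
    """Return an oligonucleotide carrying new_codon with flanking template.

    template    -- in-frame coding strand (DNA)
    codon_index -- zero-based index of the codon to replace
    new_codon   -- three-nucleotide replacement codon
    flank       -- unchanged nucleotides kept on each side (assumed value)
    """
    if len(new_codon) != 3:
        raise ValueError("replacement codon must be three nucleotides long")
    start = codon_index * 3
    end = start + 3
    if start - flank < 0 or end + flank > len(template):
        raise ValueError("not enough flanking sequence on both sides")
    # Unchanged sequence on either side lets the oligonucleotide anneal to the
    # template around the mismatched codon.
    return template[start - flank:start] + new_codon + template[end:end + flank]


if __name__ == "__main__":
    template = "ATG" + "GCT" * 10  # invented example: Met followed by ten Ala codons
    print(mutagenic_oligo(template, 5, "GAT", flank=6))
```

The function refuses to return an oligonucleotide when the requested flank would run off either end of the template, which mirrors the requirement for adjacent nucleotides on both sides of the change.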

A further technique for generating amino acid variants is the cassette mutagenesis
technique described in Wells et al., Gene 34:315 (1985); and other mutagenesis techniques well
known in the art, such as, for example, the techniques in Sambrook et al., supra, and Current
Protocols in Molecular Biology, Ausubel et al. Due to the inherent degeneracy of the genetic
code, other DNA sequences which encode substantially the same or a functionally equivalent
amino acid sequence may be used in the practice of the invention for the cloning and expression
of these novel nucleic acids. Such DNA sequences include those which are capable of
hybridizing to the appropriate novel nucleic acid sequence under stringent conditions.

Polynucleotides encoding preferred polypeptide truncations of the invention can be used
to generate polynucleotides encoding chimeric or fusion proteins comprising one or more
domains of the invention and heterologous protein sequences.

The polynucleotides of the invention additionally include the complement of any of the
polynucleotides recited above. The polynucleotide can be DNA (genomic, cDNA, amplified, or
synthetic) or RNA. Methods and algorithms for obtaining such polynucleotides are well known
to those of skill in the art and can include, for example, methods for determining hybridization
conditions that can routinely isolate polynucleotides of the desired sequence identities.

In accordance with the invention, polynucleotide sequences comprising the mature
protein coding sequences corresponding to any one of SEQ ID NO:1-1350, or functional
equivalents thereof, may be used to generate recombinant DNA molecules that direct the
expression of that nucleic acid, or a functional equivalent thereof, in appropriate host cells. Also
included are the cDNA inserts of any of the clones identified herein.

A polynucleotide according to the invention can be joined to any of a variety of other
nucleotide sequences by well-established recombinant DNA techniques (see Sambrook J et al.
(1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, NY). Useful
nucleotide sequences for joining to polynucleotides include an assortment of vectors, e.g.,
plasmids, cosmids, lambda phage derivatives, phagemids, and the like, that are well known in the
art. Accordingly, the invention also provides a vector including a polynucleotide of the
invention and a host cell containing the polynucleotide. In general, the vector contains an origin
of replication functional in at least one organism, convenient restriction endonuclease sites, and a
selectable marker for the host cell. Vectors according to the invention include expression
vectors, replication vectors, probe generation vectors, and sequencing vectors. A host cell
according to the invention can be a prokaryotic or eukaryotic cell and can be a unicellular
organism or part of a multicellular organism.

The present invention further provides recombinant constructs comprising a nucleic acid
having any of the nucleotide sequences of SEQ ID NO:1-1350 or a fragment thereof or any other
polynucleotides of the invention. In one embodiment, the recombinant constructs of the present
invention comprise a vector, such as a plasmid or viral vector, into which a nucleic acid having
any of the nucleotide sequences of SEQ ID NO:1-1350 or a fragment thereof is inserted, in a
forward or reverse orientation. In the case of a vector comprising one of the ORFs of the present
invention, the vector may further comprise regulatory sequences, including for example, a
promoter, operably linked to the ORF. Large numbers of suitable vectors and promoters are
known to those of skill in the art and are commercially available for generating the recombinant
constructs of the present invention. The following vectors are provided by way of example.
Bacterial: pBs, phagescript, PsiX174, pBluescript SK, pBs KS, pNH8a, pNH16a, pNH18a,
pNH46a (Stratagene); pTrc99A, pKK223-3, pKK233-3, pDR540, pRIT5 (Pharmacia).
Eukaryotic: pWLneo, pSV2cat, pOG44, PXTI, pSG (Stratagene); pSVK3, pBPV, pMSG, pSVL
(Pharmacia).

The isolated polynucleotide of the invention may be operably linked to an expression
control sequence such as the pMT2 or pED expression vectors disclosed in Kaufman et al.,
Nucleic Acids Res. 19, 4485-4490 (1991), in order to produce the protein recombinantly. Many
suitable expression control sequences are known in the art. General methods of expressing
recombinant proteins are also known and are exemplified in R. Kaufman, Methods in
Enzymology 185, 537-566 (1990). As defined herein "operably linked" means that the isolated
polynucleotide of the invention and an expression control sequence are situated within a vector
or cell in such a way that the protein is expressed by a host cell which has been transformed
(transfected) with the ligated polynucleotide/expression control sequence.

Promoter regions can be selected from any desired gene using CAT (chloramphenicol
transferase) vectors or other vectors with selectable markers. Two appropriate vectors are
pKK232-8 and pCM7. Particular named bacterial promoters include lacI, lacZ, T3, T7, gpt,
lambda PR, and trc. Eukaryotic promoters include CMV immediate early, HSV thymidine
kinase, early and late SV40, LTRs from retrovirus, and mouse metallothionein-I. Selection of
the appropriate vector and promoter is well within the level of ordinary skill in the art.
Generally, recombinant expression vectors will include origins of replication and selectable
markers permitting transformation of the host cell, e.g., the ampicillin resistance gene of E. coli
and S. cerevisiae TRP1 gene, and a promoter derived from a highly-expressed gene to direct
transcription of a downstream structural sequence. Such promoters can be derived from operons
encoding glycolytic enzymes such as 3-phosphoglycerate kinase (PGK), α-factor, acid
phosphatase, or heat shock proteins, among others. The heterologous structural sequence is
assembled in appropriate phase with translation initiation and termination sequences, and
preferably, a leader sequence capable of directing secretion of translated protein into the
periplasmic space or extracellular medium. Optionally, the heterologous sequence can encode a
fusion protein including an amino terminal identification peptide imparting desired
characteristics, e.g., stabilization or simplified purification of expressed recombinant product.

Useful expression vectors for bacterial use are constructed by inserting a structural DNA
sequence encoding a desired protein together with suitable translation initiation and termination
signals in operable reading phase with a functional promoter. The vector will comprise one or
more phenotypic selectable markers and an origin of replication to ensure maintenance of the
vector and to, if desirable, provide amplification within the host. Suitable prokaryotic hosts for
transformation include E. coli, Bacillus subtilis, Salmonella typhimurium and various species
within the genera Pseudomonas and Streptomyces, although others may also be
employed as a matter of choice.

As a representative but non-limiting example, useful expression vectors for bacterial use
can comprise a selectable marker and bacterial origin of replication derived from commercially
available plasmids comprising genetic elements of the well known cloning vector pBR322
(ATCC 37017). Such commercial vectors include, for example, pKK223-3 (Pharmacia Fine
Chemicals, Uppsala, Sweden) and GEM I (Promega Biotech, Madison, WI, USA). These
pBR322 "backbone" sections are combined with an appropriate promoter and the structural
sequence to be expressed. Following transformation of a suitable host strain and growth of the
host strain to an appropriate cell density, the selected promoter is induced or derepressed by
appropriate means (e.g., temperature shift or chemical induction) and cells are cultured for an
additional period. Cells are typically harvested by centrifugation, disrupted by physical or
chemical means, and the resulting crude extract retained for further purification.

Polynucleotides of the invention can also be used to induce immune responses. For
example, as described in Fan et al., Nat. Biotech. 17:870-872 (1999), incorporated herein by
reference, nucleic acid sequences encoding a polypeptide may be used to generate antibodies
against the encoded polypeptide following topical administration of naked plasmid DNA or
following injection, and preferably intramuscular injection, of the DNA. The nucleic acid
sequences are preferably inserted in a recombinant expression vector and may be in the form of
naked DNA.

4.3 ANTISENSE

Another aspect of the invention pertains to isolated antisense nucleic acid molecules that
are hybridizable to or complementary to the nucleic acid molecule comprising the nucleotide
sequence of SEQ ID NO:1-1350, or fragments, analogs or derivatives thereof. An "antisense"
nucleic acid comprises a nucleotide sequence that is complementary to a "sense" nucleic acid
encoding a protein, e.g., complementary to the coding strand of a double-stranded cDNA
molecule or complementary to an mRNA sequence. In specific aspects, antisense nucleic acid
molecules are provided that comprise a sequence complementary to at least about 10, 25, 50,
100, 250 or 500 nucleotides or an entire coding strand, or to only a portion thereof. Nucleic acid
molecules encoding fragments, homologs, derivatives and analogs of a protein of any of SEQ ID
NO:1351-2700 or antisense nucleic acids complementary to a nucleic acid sequence of SEQ ID
NO:1-1350 are additionally provided.

In one embodiment, an antisense nucleic acid molecule is antisense to a "coding region"
of the coding strand of a nucleotide sequence of the invention. The term "coding region" refers
to the region of the nucleotide sequence comprising codons which are translated into amino acid
residues. In another embodiment, the antisense nucleic acid molecule is antisense to a
"noncoding region" of the coding strand of a nucleotide sequence of the invention. The term
"noncoding region" refers to 5' and 3' sequences which flank the coding region that are not
translated into amino acids (i.e., also referred to as 5' and 3' untranslated regions).

Given the coding strand sequences encoding a nucleic acid disclosed herein (e.g., SEQ
ID NO:1-1350), antisense nucleic acids of the invention can be designed according to the rules
of Watson and Crick or Hoogsteen base pairing. The antisense nucleic acid molecule can be
complementary to the entire coding region of a mRNA, but more preferably is an oligonucleotide
that is antisense to only a portion of the coding or noncoding region of a mRNA. For example,
the antisense oligonucleotide can be complementary to the region surrounding the translation
start site of a mRNA. An antisense oligonucleotide can be, for example, about 5, 10, 15, 20, 25,
30, 35, 40, 45 or 50 nucleotides in length. An antisense nucleic acid of the invention can be
constructed using chemical synthesis or enzymatic ligation reactions using procedures known in
the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can be
chemically synthesized using naturally occurring nucleotides or variously modified nucleotides
designed to increase the biological stability of the molecules or to increase the physical stability
of the duplex formed between the antisense and sense nucleic acids, e.g., phosphorothioate
derivatives and acridine substituted nucleotides can be used.
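
Designing an antisense oligonucleotide by Watson and Crick base pairing, as described above, amounts to taking the reverse complement of the chosen target region. The Python sketch below illustrates this for the region surrounding a translation start site. It is an illustration only: the function name antisense_oligo, the default window of 10 nucleotides on each side of the start codon and the example sequence are assumptions, and the sketch covers only unmodified A, C, G and U bases, not Hoogsteen pairing or modified nucleotides.

```python
# Watson-Crick complements for RNA bases.
COMPLEMENT = {"A": "U", "C": "G", "G": "C", "U": "A"}


def antisense_oligo(mrna, start_codon_index, upstream=10, downstream=10):
    """Return an antisense RNA oligonucleotide (written 5' to 3').

    mrna              -- sense-strand sequence (T is treated as U)
    start_codon_index -- zero-based position of the A of the start codon
    upstream          -- nucleotides to cover 5' of the start codon (assumed)
    downstream        -- nucleotides to cover 3' of the start codon (assumed)
    """
    mrna = mrna.upper().replace("T", "U")
    begin = max(0, start_codon_index - upstream)
    end = min(len(mrna), start_codon_index + 3 + downstream)
    target = mrna[begin:end]
    # Reverse the target and complement each base to get the antisense strand.
    return "".join(COMPLEMENT[base] for base in reversed(target))


if __name__ == "__main__":
    # Invented example: start codon AUG begins at index 4.
    print(antisense_oligo("GGCCAUGGCA", 4, upstream=2, downstream=2))
```

The window is clipped at the ends of the input sequence, so a start codon near the 5' end simply yields a shorter oligonucleotide.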

Examples of modified nucleotides that can be used to generate the antisense nucleic acid
include: 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine,
4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil, 5-carboxymethylaminomethyl-2-thiouridine,
5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine,
inosine, N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine,
2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine,
7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil,
beta-D-mannosylqueosine, 5'-methoxycarboxymethyluracil, 5-methoxyuracil,
2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil,
queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil,
uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil,
3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and 2,6-diaminopurine. Alternatively, the
antisense nucleic acid can be produced biologically using an expression vector into which a
nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the
inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest,
described further in the following subsection).

The antisense nucleic acid molecules of the invention are typically administered to a
subject or generated in situ such that they hybridize with or bind to cellular mRNA and/or
genomic DNA encoding a protein according to the invention to thereby inhibit expression of the
protein, e.g., by inhibiting transcription and/or translation. The hybridization can be by
conventional nucleotide complementarity to form a stable duplex, or, for example, in the case of
an antisense nucleic acid molecule that binds to DNA duplexes, through specific interactions in
the major groove of the double helix. An example of a route of administration of antisense
nucleic acid molecules of the invention includes direct injection at a tissue site. Alternatively,
antisense nucleic acid molecules can be modified to target selected cells and then administered
systemically. For example, for systemic administration, antisense molecules can be modified
such that they specifically bind to receptors or antigens expressed on a selected cell surface, e.g.,
by linking the antisense nucleic acid molecules to peptides or antibodies that bind to cell surface
receptors or antigens. The antisense nucleic acid molecules can also be delivered to cells using
the vectors described herein. To achieve sufficient intracellular concentrations of antisense
molecules, vector constructs in which the antisense nucleic acid molecule is placed under the
control of a strong pol II or pol III promoter are preferred.

In yet another embodiment, the antisense nucleic acid molecule of the invention is an
α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific
double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the
strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res 15: 6625-6641). The
antisense nucleic acid molecule can also comprise a 2'-o-methylribonucleotide (Inoue et al.
(1987) Nucleic Acids Res 15: 6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987)
FEBS Lett 215: 327-330).



4.4 RIBOZYMES AND PNA MOIETIES 

In still another embodiment, an antisense nucleic acid of the invention is a ribozyme. 
Ribozymes are catalytic RNA molecules with ribonuclease activity that are capable of cleaving a 
single-stranded nucleic acid, such as an mRNA, to which they have a complementary region.
Thus, ribozymes (e.g., hammerhead ribozymes (described in Haselhoff and Gerlach (1988)
Nature 334:585-591)) can be used to catalytically cleave mRNA transcripts to thereby inhibit
translation of the mRNA. A ribozyme having specificity for a nucleic acid of the invention can be
designed based upon the nucleotide sequence of a DNA disclosed herein (i.e., SEQ ID NO: 1-
1350). For example, a derivative of a Tetrahymena L-19 IVS RNA can be constructed in which
the nucleotide sequence of the active site is complementary to the nucleotide sequence to be
cleaved in a SECX-encoding mRNA. See, e.g., Cech et al. U.S. Pat. No. 4,987,071; and Cech et
al. U.S. Pat. No. 5,116,742. Alternatively, SECX mRNA can be used to select a catalytic RNA
having a specific ribonuclease activity from a pool of RNA molecules. See, e.g., Bartel et al.,
(1993) Science 261:1411-1418. 

Alternatively, gene expression can be inhibited by targeting nucleotide sequences 
complementary to the regulatory region (e.g., promoter and/or enhancers) to form triple helical 
structures that prevent transcription of the gene in target cells. See generally, Helene (1991)
Anticancer Drug Des. 6: 569-84; Helene et al. (1992) Ann. N.Y. Acad. Sci. 660:27-36; and
Maher (1992) Bioessays 14: 807-15.

In various embodiments, the nucleic acids of the invention can be modified at the base 
moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or
solubility of the molecule. For example, the deoxyribose phosphate backbone of the nucleic
acids can be modified to generate peptide nucleic acids (see Hyrup et al. (1996) Bioorg Med
Chem 4: 5-23). As used herein, the terms "peptide nucleic acids" or "PNAs" refer to nucleic acid
mimics, e.g., DNA mimics, in which the deoxyribose phosphate backbone is replaced by a
pseudopeptide backbone and only the four natural nucleobases are retained. The neutral
backbone of PNAs has been shown to allow for specific hybridization to DNA and RNA under
conditions of low ionic strength. The synthesis of PNA oligomers can be performed using
standard solid phase peptide synthesis protocols as described in Hyrup et al. (1996) above;
Perry-O'Keefe et al. (1996) PNAS 93: 14670-675.

PNAs of the invention can be used in therapeutic and diagnostic applications. For 
example, PNAs can be used as antisense or antigene agents for sequence-specific modulation of
gene expression by, e.g., inducing transcription or translation arrest or inhibiting replication.
PNAs of the invention can also be used, e.g., in the analysis of single base pair mutations in a
gene by, e.g., PNA directed PCR clamping; as artificial restriction enzymes when used in
combination with other enzymes, e.g., S1 nucleases (Hyrup B. (1996) above); or as probes or
primers for DNA sequence and hybridization (Hyrup et al. (1996), above; Perry-O'Keefe (1996),
above). 

In another embodiment, PNAs of the invention can be modified, e.g., to enhance their
stability or cellular uptake, by attaching lipophilic or other helper groups to PNA, by the
formation of PNA-DNA chimeras, or by the use of liposomes or other techniques of drug
delivery known in the art. For example, PNA-DNA chimeras can be generated that may
combine the advantageous properties of PNA and DNA. Such chimeras allow DNA recognition
enzymes, e.g., RNase H and DNA polymerases, to interact with the DNA portion while the PNA
portion would provide high binding affinity and specificity. PNA-DNA chimeras can be linked
using linkers of appropriate lengths selected in terms of base stacking, number of bonds between
the nucleobases, and orientation (Hyrup (1996) above). The synthesis of PNA-DNA chimeras
can be performed as described in Hyrup (1996) above and Finn et al. (1996) Nucl Acids Res 24:
3357-63. For example, a DNA chain can be synthesized on a solid support using standard
phosphoramidite coupling chemistry, and modified nucleoside analogs, e.g.,
5'-(4-methoxytrityl)amino-5'-deoxy-thymidine phosphoramidite, can be used between the PNA
and the 5' end of DNA (Mag et al. (1989) Nucl Acid Res 17: 5973-88). PNA monomers are then
coupled in a stepwise manner to produce a chimeric molecule with a 5' PNA segment and a 3'
DNA segment (Finn et al. (1996) above). Alternatively, chimeric molecules can be synthesized
with a 5' DNA segment and a 3' PNA segment. See, Petersen et al. (1975) Bioorg Med Chem
Lett 5:1119-11124. 

In other embodiments, the oligonucleotide may include other appended groups such as
peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the
cell membrane (see, e.g., Letsinger et al., 1989, Proc. Natl. Acad. Sci. U.S.A. 86:6553-6556;
Lemaitre et al., 1987, Proc. Natl. Acad. Sci. 84:648-652; PCT Publication No. WO88/09810) or
the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134). In addition,
oligonucleotides can be modified with hybridization-triggered cleavage agents (See, e.g., Krol et
al., 1988, BioTechniques 6:958-976) or intercalating agents. (See, e.g., Zon, 1988, Pharm. Res.
5: 539-549). To this end, the oligonucleotide may be conjugated to another molecule, e.g., a 
peptide, a hybridization triggered cross-linking agent, a transport agent, a hybridization-triggered 
cleavage agent, etc. 

4.5 HOSTS

The present invention further provides host cells genetically engineered to contain the 
polynucleotides of the invention. For example, such host cells may contain nucleic acids of the 
invention introduced into the host cell using known transformation, transfection or infection 
methods. The present invention still further provides host cells genetically engineered to express
the polynucleotides of the invention, wherein such polynucleotides are in operative association 
with a regulatory sequence heterologous to the host cell which drives expression of the 
polynucleotides in the cell. 

Knowledge of nucleic acid sequences allows for modification of cells to permit, or 
increase, expression of endogenous polypeptide. Cells can be modified (e.g., by homologous
recombination) to provide increased polypeptide expression by replacing, in whole or in part, the
naturally occurring promoter with all or part of a heterologous promoter so that the cells express
the polypeptide at higher levels. The heterologous promoter is inserted in such a manner that it
is operatively linked to the encoding sequences. See, for example, PCT International Publication
No. WO94/12650, PCT International Publication No. WO92/20808, and PCT International
Publication No. WO91/09955. It is also contemplated that, in addition to heterologous promoter
DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or
intron DNA may be inserted along with the heterologous promoter DNA. If linked to the coding
sequence, amplification of the marker DNA by standard selection methods results in co-
amplification of the desired protein coding sequences in the cells.

The host cell can be a higher eukaryotic host cell, such as a mammalian cell, a lower 
eukaryotic host cell, such as a yeast cell, or the host cell can be a prokaryotic cell, such as a 
bacterial cell. Introduction of the recombinant construct into the host cell can be effected by
calcium phosphate transfection, DEAE-dextran mediated transfection, or electroporation (Davis,
L. et al., Basic Methods in Molecular Biology (1986)). The host cells containing one of the
polynucleotides of the invention can be used in conventional manners to produce the gene
product encoded by the isolated fragment (in the case of an ORF) or can be used to produce a 
heterologous protein under the control of the EMF. 

Any host/vector system can be used to express one or more of the ORFs of the present
invention. These include, but are not limited to, eukaryotic hosts such as HeLa cells, CV-1 cells,
COS cells, 293 cells, and Sf9 cells, as well as prokaryotic hosts such as E. coli and B. subtilis.
The most preferred cells are those which do not normally express the particular polypeptide or
protein or which express the polypeptide or protein at a low natural level. Mature proteins can
be expressed in mammalian cells, yeast, bacteria, or other cells under the control of appropriate
promoters. Cell-free translation systems can also be employed to produce such proteins using
RNAs derived from the DNA constructs of the present invention. Appropriate cloning and
expression vectors for use with prokaryotic and eukaryotic hosts are described by Sambrook, et
al., in Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor, New
York (1989), the disclosure of which is hereby incorporated by reference.

Various mammalian cell culture systems can also be employed to express recombinant 
protein. Examples of mammalian expression systems include the COS-7 lines of monkey kidney 
fibroblasts, described by Gluzman, Cell 23:175 (1981). Other cell lines capable of expressing a
compatible vector are, for example, the C127, monkey COS cells, Chinese Hamster Ovary
(CHO) cells, human kidney 293 cells, human epidermal A431 cells, human Colo205 cells, 3T3
cells, CV-1 cells, other transformed primate cell lines, normal diploid cells, cell strains derived
from in vitro culture of primary tissue, primary explants, HeLa cells, mouse L cells, BHK,
HL-60, U937, HaK or Jurkat cells. Mammalian expression vectors will comprise an origin of
replication, a suitable promoter and also any necessary ribosome binding sites, polyadenylation
site, splice donor and acceptor sites, transcriptional termination sequences, and 5' flanking
nontranscribed sequences. DNA sequences derived from the SV40 viral genome, for example,
SV40 origin, early promoter, enhancer, splice, and polyadenylation sites may be used to provide
the required nontranscribed genetic elements. Recombinant polypeptides and proteins produced
in bacterial culture are usually isolated by initial extraction from cell pellets, followed by one or
more salting-out, aqueous ion exchange or size exclusion chromatography steps. Protein
refolding steps can be used, as necessary, in completing configuration of the mature protein.
Finally, high performance liquid chromatography (HPLC) can be employed for final purification
steps. Microbial cells employed in expression of proteins can be disrupted by any convenient
method, including freeze-thaw cycling, sonication, mechanical disruption, or use of cell lysing
agents.

Alternatively, it may be possible to produce the protein in lower eukaryotes such as yeast 
or insects or in prokaryotes such as bacteria. Potentially suitable yeast strains include
Saccharomyces cerevisiae, Schizosaccharomyces pombe, Kluyveromyces strains, Candida, or
any yeast strain capable of expressing heterologous proteins. Potentially suitable bacterial
strains include Escherichia coli, Bacillus subtilis, Salmonella typhimurium, or any bacterial
strain capable of expressing heterologous proteins. If the protein is made in yeast or bacteria, it
may be necessary to modify the protein produced therein, for example by phosphorylation or
glycosylation of the appropriate sites, in order to obtain the functional protein. Such covalent
attachments may be accomplished using known chemical or enzymatic methods. 

In another embodiment of the present invention, cells and tissues may be engineered to 
express an endogenous gene comprising the polynucleotides of the invention under the control of 
inducible regulatory elements, in which case the regulatory sequences of the endogenous gene 
may be replaced by homologous recombination. As described herein, gene targeting can be used 
to replace a gene's existing regulatory region with a regulatory sequence isolated from a different 
gene or a novel regulatory sequence synthesized by genetic engineering methods. Such 
regulatory sequences may be comprised of promoters, enhancers, scaffold-attachment regions,
negative regulatory elements, transcriptional initiation sites, regulatory protein binding sites or
combinations of said sequences. Alternatively, sequences which affect the structure or stability
of the RNA or protein produced may be replaced, removed, added, or otherwise modified by
targeting. These sequences include polyadenylation signals, mRNA stability elements, splice
sites, leader sequences for enhancing or modifying transport or secretion properties of the
protein, or other sequences which alter or improve the function or stability of protein or RNA 
molecules. 

The targeting event may be a simple insertion of the regulatory sequence, placing the
gene under the control of the new regulatory sequence, e.g., inserting a new promoter or
enhancer or both upstream of a gene. Alternatively, the targeting event may be a simple deletion
of a regulatory element, such as the deletion of a tissue-specific negative regulatory element.
Alternatively, the targeting event may replace an existing element; for example, a tissue-specific
enhancer can be replaced by an enhancer that has broader or different cell-type specificity than
the naturally occurring elements. Here, the naturally occurring sequences are deleted and new
sequences are added. In all cases, the identification of the targeting event may be facilitated by
the use of one or more selectable marker genes that are contiguous with the targeting DNA,
allowing for the selection of cells in which the exogenous DNA has integrated into the host cell
genome. The identification of the targeting event may also be facilitated by the use of one or
more marker genes exhibiting the property of negative selection, such that the negatively
selectable marker is linked to the exogenous DNA, but configured such that the negatively
selectable marker flanks the targeting sequence, and such that a correct homologous
recombination event with sequences in the host cell genome does not result in the stable
integration of the negatively selectable marker. Markers useful for this purpose include the
Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial xanthine-guanine
phosphoribosyl-transferase (gpt) gene.

The gene targeting or gene activation techniques which can be used in accordance with 
this aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to 
Chappel; U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No.
PCT/US92/09627 (WO93/09222) by Selden et al.; and International Application No.
PCT/US90/06436 (WO91/06667) by Skoultchi et al., each of which is incorporated by reference
herein in its entirety. 

4.6 POLYPEPTIDES OF THE INVENTION 

The isolated polypeptides of the invention include, but are not limited to, a polypeptide
comprising: the amino acid sequences set forth as any one of SEQ ID NO: 1351-2700 or an
amino acid sequence encoded by any one of the nucleotide sequences SEQ ID NO: 1-1350 or the
corresponding full length or mature protein. Polypeptides of the invention also include
polypeptides preferably with biological or immunological activity that are encoded by: (a) a
polynucleotide having any one of the nucleotide sequences set forth in SEQ ID NO: 1-1350 or (b)
polynucleotides encoding any one of the amino acid sequences set forth as SEQ ID NO: 1351-
2700 or (c) polynucleotides that hybridize to the complement of the polynucleotides of either (a)
or (b) under stringent hybridization conditions. The invention also provides biologically active
or immunologically active variants of any of the amino acid sequences set forth as SEQ ID
NO: 1351-2700 or the corresponding full length or mature protein; and "substantial equivalents"
thereof (e.g., with at least about 65%, at least about 70%, at least about 75%, at least about 80%,
at least about 85%, 86%, 87%, 88%, 89%, at least about 90%, 91%, 92%, 93%, 94%, typically at
least about 95%, 96%, 97%, more typically at least about 98%, or most typically at least about
99% amino acid identity) that retain biological activity. Polypeptides encoded by allelic variants
may have a similar, increased, or decreased activity compared to polypeptides comprising SEQ
ID NO: 1351-2700.
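
As a non-limiting illustration, percent amino acid identity between two pre-aligned sequences can be computed as sketched below in Python. The sketch assumes the sequences have already been aligned (gaps written as "-") and assumes one common convention in which identity is the number of matching columns divided by the number of columns where neither sequence has a gap; other programs use other denominators. The function name `percent_identity` and the example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the ungapped columns of a pairwise alignment.

    Assumption: both inputs are rows of the same alignment, so they have
    equal length, and '-' marks a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns in which both sequences carry a residue.
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a != "-" and b != "-"
    ]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # Hypothetical 8-residue peptides differing at one position.
    print(percent_identity("MKTAYIAK", "MKTAHIAK"))
```

Under this convention, a variant would meet an "at least about 65%" threshold when the returned value is 65.0 or greater.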

Fragments of the proteins of the present invention which are capable of exhibiting 
biological activity are also encompassed by the present invention. Fragments of the protein may 
be in linear form or they may be cyclized using known methods, for example, as described in H.
U. Saragovi, et al., Bio/Technology 10, 773-778 (1992) and in R. S. McDowell, et al., J. Amer.
Chem. Soc. 114, 9245-9253 (1992), both of which are incorporated herein by reference. Such
fragments may be fused to carrier molecules such as immunoglobulins for many purposes, 
including increasing the valency of protein binding sites. 

The present invention also provides both full-length and mature forms (for example, 
without a signal sequence or precursor sequence) of the disclosed proteins. The protein coding
sequence is identified in the sequence listing by translation of the disclosed nucleotide
sequences. The mature form of such protein may be obtained by expression of a full-length
polynucleotide in a suitable mammalian cell or other host cell. The sequence of the mature form
of the protein is also determinable from the amino acid sequence of the full-length form. Where
proteins of the present invention are membrane bound, soluble forms of the proteins are also
provided. In such forms, part or all of the regions causing the proteins to be membrane bound 
are deleted so that the proteins are fully secreted from the cell in which they are expressed. 

Protein compositions of the present invention may further comprise an acceptable carrier, 
such as a hydrophilic, e.g., pharmaceutically acceptable, carrier.

The present invention further provides isolated polypeptides encoded by the nucleic acid
fragments of the present invention or by degenerate variants of the nucleic acid fragments of the
present invention. By "degenerate variant" is intended nucleotide fragments which differ from a
nucleic acid fragment of the present invention (e.g., an ORF) by nucleotide sequence but, due to
the degeneracy of the genetic code, encode an identical polypeptide sequence. Preferred nucleic
acid fragments of the present invention are the ORFs that encode proteins.
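
The notion of a degenerate variant can be illustrated with a short Python sketch that translates two coding sequences and checks whether they differ in nucleotide sequence yet encode the same polypeptide. The sketch assumes the standard genetic code and in-frame DNA input whose length is a multiple of three; the function names and the example sequences are hypothetical and are not taken from the sequence listing.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ('*' = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(orf: str) -> str:
    """Translate an in-frame DNA sequence (length a multiple of 3)."""
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf), 3))


def is_degenerate_variant(orf_a: str, orf_b: str) -> bool:
    """True if the nucleotide sequences differ but encode the same polypeptide."""
    return orf_a.upper() != orf_b.upper() and translate(orf_a) == translate(orf_b)


if __name__ == "__main__":
    # GCT/GCC both encode Ala and AAA/AAG both encode Lys.
    print(is_degenerate_variant("ATGGCTAAA", "ATGGCCAAG"))
```
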

A variety of methodologies known in the art can be utilized to obtain any one of the
isolated polypeptides or proteins of the present invention. At the simplest level, the amino acid
sequence can be synthesized using commercially available peptide synthesizers. The
synthetically-constructed protein sequences, by virtue of sharing primary, secondary or tertiary
structural and/or conformational characteristics with proteins, may possess biological properties
in common therewith, including protein activity. This technique is particularly useful in
producing small peptides and fragments of larger polypeptides. Fragments are useful, for
example, in generating antibodies against the native polypeptide. Thus, they may be employed
as biologically active or immunological substitutes for natural, purified proteins in screening of
therapeutic compounds and in immunological processes for the development of antibodies.

The polypeptides and proteins of the present invention can alternatively be purified from 
cells which have been altered to express the desired polypeptide or protein. As used herein, a
cell is said to be altered to express a desired polypeptide or protein when the cell, through genetic
manipulation, is made to produce a polypeptide or protein which it normally does not produce or
which the cell normally produces at a lower level. One skilled in the art can readily adapt
procedures for introducing and expressing either recombinant or synthetic sequences into 
eukaryotic or prokaryotic cells in order to generate a cell which produces one of the polypeptides 
or proteins of the present invention. 

The invention also relates to methods for producing a polypeptide comprising growing a
culture of host cells of the invention in a suitable culture medium, and purifying the protein from
the cells or the culture in which the cells are grown. For example, the methods of the invention
include a process for producing a polypeptide in which a host cell containing a suitable
expression vector that includes a polynucleotide of the invention is cultured under conditions that
allow expression of the encoded polypeptide. The polypeptide can be recovered from the
culture, conveniently from the culture medium, or from a lysate prepared from the host cells and
further purified. Preferred embodiments include those in which the protein produced by such 
process is a full length or mature form of the protein. 

In an alternative method, the polypeptide or protein is purified from bacterial cells which 
naturally produce the polypeptide or protein. One skilled in the art can readily follow known
methods for isolating polypeptides and proteins in order to obtain one of the isolated
polypeptides or proteins of the present invention. These include, but are not limited to,
immunochromatography, HPLC, size-exclusion chromatography, ion-exchange chromatography,
and immuno-affinity chromatography. See, e.g., Scopes, Protein Purification: Principles and
Practice, Springer-Verlag (1994); Sambrook, et al., in Molecular Cloning: A Laboratory
Manual; Ausubel et al., Current Protocols in Molecular Biology. Polypeptide fragments that
retain biological/immunological activity include fragments comprising greater than about 100
amino acids, or greater than about 200 amino acids, and fragments that encode specific protein 
domains. 

The purified polypeptides can be used in in vitro binding assays which are well known in 
the art to identify molecules which bind to the polypeptides. These molecules include, but are not
limited to, e.g., small molecules, molecules from combinatorial libraries, antibodies or other
proteins. The molecules identified in the binding assay are then tested for antagonist or agonist
activity in in vivo tissue culture or animal models that are well known in the art. In brief, the
molecules are titrated into a plurality of cell cultures or animals and then tested for either
cell/animal death or prolonged survival of the animal/cells.

In addition, the peptides of the invention or molecules capable of binding to the peptides 
may be complexed with toxins, e.g., ricin or cholera, or with other compounds that are toxic to
cells. The toxin-binding molecule complex is then targeted to a tumor or other cell by the
specificity of the binding molecule for SEQ ID NO: 1351-2700.

The protein of the invention may also be expressed as a product of transgenic animals,
e.g., as a component of the milk of transgenic cows, goats, pigs, or sheep which are characterized
by somatic or germ cells containing a nucleotide sequence encoding the protein. 

The proteins provided herein also include proteins characterized by amino acid sequences 
similar to those of purified proteins but into which modifications are naturally provided or
deliberately engineered. For example, modifications, in the peptide or DNA sequence, can be
made by those skilled in the art using known techniques. Modifications of interest in the protein
sequences may include the alteration, substitution, replacement, insertion or deletion of a
selected amino acid residue in the coding sequence. For example, one or more of the cysteine
residues may be deleted or replaced with another amino acid to alter the conformation of the
molecule. Techniques for such alteration, substitution, replacement, insertion or deletion are
well known to those skilled in the art (see, e.g., U.S. Pat. No. 4,518,584). Preferably, such
alteration, substitution, replacement, insertion or deletion retains the desired activity of the
protein. Regions of the protein that are important for the protein function can be determined by
various methods known in the art including the alanine-scanning method which involves
systematic substitution of single or strings of amino acids with alanine, followed by testing the
resulting alanine-containing variant for biological activity. This type of analysis determines the 
importance of the substituted amino acid(s) in biological activity. Regions of the protein that are 
important for protein function may be determined by the eMATRIX program. 
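
The design step of the alanine-scanning method described above can be sketched in Python as follows. The sketch only enumerates candidate variant sequences; testing each variant for biological activity remains an experimental step. The function name `alanine_scan`, the `window` parameter (the length of the string of residues replaced at once), and the example peptide are assumptions introduced for illustration.

```python
def alanine_scan(protein: str, window: int = 1) -> list[tuple[int, str]]:
    """Enumerate alanine-substituted variants of a protein sequence.

    Returns (position, variant) pairs, where position is the 1-based index of
    the first substituted residue. Windows that already consist only of
    alanine are skipped because substitution would not change the sequence.
    """
    protein = protein.upper()
    if window < 1 or window > len(protein):
        raise ValueError("window must be between 1 and the sequence length")
    variants = []
    for start in range(len(protein) - window + 1):
        segment = protein[start:start + window]
        if set(segment) == {"A"}:
            continue
        variant = protein[:start] + "A" * window + protein[start + window:]
        variants.append((start + 1, variant))
    return variants


if __name__ == "__main__":
    # Hypothetical 4-residue peptide; position 3 is already alanine.
    for position, variant in alanine_scan("MKAC"):
        print(position, variant)
```

A loss of activity in a given variant suggests that the substituted residue or residues contribute to function.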

Other fragments and derivatives of the sequences of proteins which would be expected to 
retain protein activity in whole or in part and are useful for screening or other immunological
methodologies may also be easily made by those skilled in the art given the disclosures herein.
Such modifications are encompassed by the present invention. 

The protein may also be produced by operably linking the isolated polynucleotide of the 
invention to suitable control sequences in one or more insect expression vectors, and employing
an insect expression system. Materials and methods for baculovirus/insect cell expression
systems are commercially available in kit form from, e.g., Invitrogen, San Diego, Calif., U.S.A.
(the MaxBac™ kit), and such methods are well known in the art, as described in Summers and
Smith, Texas Agricultural Experiment Station Bulletin No. 1555 (1987), incorporated herein by
reference. As used herein, an insect cell capable of expressing a polynucleotide of the present
invention is "transformed."

The protein of the invention may be prepared by culturing transformed host cells under 
culture conditions suitable to express the recombinant protein. The resulting expressed protein 
may then be purified from such culture (i.e., from culture medium or cell extracts) using known
purification processes, such as gel filtration and ion exchange chromatography. The purification
of the protein may also include an affinity column containing agents which will bind to the
protein; one or more column steps over such affinity resins as concanavalin A-agarose,
heparin-toyopearl™ or Cibacron blue 3GA Sepharose™; one or more steps involving
hydrophobic interaction chromatography using such resins as phenyl ether, butyl ether, or propyl 
ether; or immunoaffinity chromatography. 

Alternatively, the protein of the invention may also be expressed in a form which will
facilitate purification. For example, it may be expressed as a fusion protein, such as those of
maltose binding protein (MBP), glutathione-S-transferase (GST) or thioredoxin (TRX), or as a
His tag. Kits for expression and purification of such fusion proteins are commercially available
from New England BioLab (Beverly, Mass.), Pharmacia (Piscataway, N.J.) and Invitrogen,
respectively. The protein can also be tagged with an epitope and subsequently purified by using
a specific antibody directed to such epitope. One such epitope ("FLAG®") is commercially 
available from Kodak (New Haven, Conn.). 

Finally, one or more reverse-phase high performance liquid chromatography (RP-HPLC)
steps employing hydrophobic RP-HPLC media, e.g., silica gel having pendant methyl or other
aliphatic groups, can be employed to further purify the protein. Some or all of the foregoing
purification steps, in various combinations, can also be employed to provide a substantially
homogeneous isolated recombinant protein. The protein thus purified is substantially free of
other mammalian proteins and is defined in accordance with the present invention as an "isolated
protein."

The polypeptides of the invention include analogs (variants). This embraces fragments,
as well as peptides in which one or more amino acids has been deleted, inserted, or substituted. 
Also, analogs of the polypeptides of the invention embrace fusions of the polypeptides or 
modifications of the polypeptides of the invention, wherein the polypeptide or analog is fused to 
another moiety or moieties, e.g., targeting moiety or another therapeutic agent. Such analogs
may exhibit improved properties such as activity and/or stability. Examples of moieties which
may be fused to the polypeptide or an analog include, for example, targeting moieties which
provide for the delivery of polypeptide to pancreatic cells, e.g., antibodies to pancreatic cells,
antibodies to immune cells such as T-cells, monocytes, dendritic cells, granulocytes, etc., as well
as receptors and ligands expressed on pancreatic or immune cells. Other moieties which may be
fused to the polypeptide include therapeutic agents which are used for treatment, for example,
immunosuppressive drugs such as cyclosporin, FK506, azathioprine, CD3 antibodies and
steroids. Also, polypeptides may be fused to immune modulators, and other cytokines such as 
alpha or beta interferon. 

4.6.1 DETERMINING POLYPEPTIDE AND POLYNUCLEOTIDE IDENTITY 
AND SIMILARITY 

Preferred methods to determine identity and/or similarity are designed to give the largest match
between the sequences tested. Methods to determine identity and similarity are codified in
computer programs including, but not limited to, the GCG program package, including GAP
(Devereux, J., et al., Nucleic Acids Research 12(1):387 (1984); Genetics Computer Group,
University of Wisconsin, Madison, WI), BLASTP, BLASTN, BLASTX, FASTA (Altschul, S.F.
et al., J. Molec. Biol. 215:403-410 (1990)), PSI-BLAST (Altschul, S.F. et al., Nucleic Acids Res.
vol. 25, pp. 3389-3402, herein incorporated by reference), eMatrix software (Wu et al., J. Comp.
Biol., Vol. 6, pp. 219-235 (1999), herein incorporated by reference), eMotif software
(Nevill-Manning et al., ISMB-97, Vol. 4, pp. 202-209, herein incorporated by reference), pFam software
(Sonnhammer et al., Nucleic Acids Res., Vol. 26(1), pp. 320-322 (1998), herein incorporated by
reference) and the Kyte-Doolittle hydrophobicity prediction algorithm (J. Mol. Biol., 157, pp.
105-31 (1982), incorporated herein by reference). The BLAST programs are publicly available
from the National Center for Biotechnology Information (NCBI) and other sources (BLAST
Manual, Altschul, S., et al. NCBI NLM NIH Bethesda, MD 20894; Altschul, S., et al., J. Mol.
Biol. 215:403-410 (1990)).
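
By way of illustration only, the notion of finding "the largest match between the sequences
tested" can be sketched with a simple global alignment. The following is a minimal sketch and
is not the algorithm of GAP, BLAST or any other program cited above; the scoring values
(match, mismatch, gap), the function names, and the choice of the alignment length as the
denominator for percent identity are all assumptions made for the example.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch style global alignment with a linear gap penalty.

    The scoring values are illustrative assumptions, not those of any cited program.
    Returns the two sequences padded with '-' at gap positions.
    """
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Identical aligned positions as a percentage of the alignment length (assumed convention)."""
    aligned_a, aligned_b = global_align(a, b)
    if not aligned_a:
        return 0.0
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


if __name__ == "__main__":
    print(global_align("ACGT", "AGT"))
    print(percent_identity("ACGT", "AGT"))
```

Different programs report identity over different denominators (for example, the shorter
sequence or the aligned region only), so values from this sketch are not expected to agree
with the output of the cited packages.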

4.7 CHIMERIC AND FUSION PROTEINS 

The invention also provides chimeric or fusion proteins. As used herein, a "chimeric
protein" or "fusion protein" comprises a polypeptide of the invention operatively linked to
another polypeptide. Within a fusion protein the polypeptide according to the invention can
correspond to all or a portion of a protein according to the invention. In one embodiment, a
fusion protein comprises at least one biologically active portion of a protein according to the
invention. In another embodiment, a fusion protein comprises at least two biologically active
portions of a protein according to the invention. Within the fusion protein, the term "operatively
linked" is intended to indicate that the polypeptide according to the invention and the other
polypeptide are fused in-frame to each other. The polypeptide can be fused to the N-terminus or
C-terminus.

For example, in one embodiment a fusion protein comprises a polypeptide according to
the invention operably linked to the extracellular domain of a second protein.

In another embodiment, the fusion protein is a GST-fusion protein in which the
polypeptide sequences of the invention are fused to the C-terminus of the GST (i.e., glutathione
S-transferase) sequences.

In another embodiment, the fusion protein is an immunoglobulin fusion protein in which
the polypeptide sequences according to the invention, comprising one or more domains, are fused
to sequences derived from a member of the immunoglobulin protein family. The
immunoglobulin fusion proteins of the invention can be incorporated into pharmaceutical
compositions and administered to a subject to inhibit an interaction between a ligand and a
protein of the invention on the surface of a cell, to thereby suppress signal transduction in vivo.
The immunoglobulin fusion proteins can be used to affect the bioavailability of a cognate ligand.
Inhibition of the ligand/protein interaction may be useful therapeutically both for the treatment of
proliferative and differentiative disorders, e.g., cancer, and for modulating (e.g., promoting or
inhibiting) cell survival. Moreover, the immunoglobulin fusion proteins of the invention can be
used as immunogens to produce antibodies in a subject, to purify ligands, and in screening assays
to identify molecules that inhibit the interaction of a polypeptide of the invention with a ligand.

A chimeric or fusion protein of the invention can be produced by standard recombinant
DNA techniques. For example, DNA fragments coding for the different polypeptide sequences
are ligated together in-frame in accordance with conventional techniques, e.g., by employing
blunt-ended or stagger-ended termini for ligation, restriction enzyme digestion to provide for
appropriate termini, filling-in of cohesive ends as appropriate, alkaline phosphatase treatment to
avoid undesirable joining, and enzymatic ligation. In another embodiment, the fusion gene can
be synthesized by conventional techniques including automated DNA synthesizers.
Alternatively, PCR amplification of gene fragments can be carried out using anchor primers that
give rise to complementary overhangs between two consecutive gene fragments that can
subsequently be annealed and reamplified to generate a chimeric gene sequence (see, for
example, Ausubel et al. (eds.) Current Protocols in Molecular Biology, John Wiley &
Sons, 1992). Moreover, many expression vectors are commercially available that already encode
a fusion moiety (e.g., a GST polypeptide). A nucleic acid encoding a polypeptide of the
invention can be cloned into such an expression vector such that the fusion moiety is linked
in-frame to the protein of the invention.
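
The requirement that the fused sequences remain in-frame can be illustrated computationally.
The following is a minimal sketch under stated assumptions: the function name, the optional
linker argument and the toy sequences are hypothetical and are not sequences of the invention,
and the check covers only reading-frame arithmetic and premature stop codons, not expression
or folding of the resulting protein.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def fuse_in_frame(n_terminal_cds, c_terminal_cds, linker=""):
    """Join two coding sequences (DNA, 5' to 3') and check that the fusion stays in frame.

    Hypothetical helper for illustration: the N-terminal partner's stop codon is removed,
    each part must be a whole number of codons, and no stop codon may occur before the
    final codon of the fusion.
    """
    upstream = n_terminal_cds.upper()
    linker = linker.upper()
    downstream = c_terminal_cds.upper()

    # Remove the stop codon of the upstream partner so translation reads through.
    if upstream[-3:] in STOP_CODONS:
        upstream = upstream[:-3]

    # Any part that is not a multiple of three shifts the frame of what follows.
    for name, part in (("N-terminal", upstream), ("linker", linker), ("C-terminal", downstream)):
        if len(part) % 3 != 0:
            raise ValueError(f"{name} segment length {len(part)} is not a multiple of 3")

    fusion = upstream + linker + downstream
    codons = [fusion[i:i + 3] for i in range(0, len(fusion), 3)]

    # A stop codon anywhere except the last position would truncate the fusion protein.
    for index, codon in enumerate(codons[:-1], start=1):
        if codon in STOP_CODONS:
            raise ValueError(f"premature stop codon {codon} at codon {index}")
    return fusion


if __name__ == "__main__":
    # Toy sequences only; they do not correspond to any real gene.
    print(fuse_in_frame("ATGGCCTAA", "ATGAAATGA", linker="GGATCC"))
```

A check of this kind is typically run on the designed construct before synthesis or cloning;
it does not replace sequencing of the final construct.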

4.8 GENE THERAPY 

Mutations in the polynucleotides of the invention may result in loss of normal
function of the encoded protein. The invention thus provides gene therapy to restore normal
activity of the polypeptides of the invention, or to treat disease states involving polypeptides of
the invention. Delivery of a functional gene encoding polypeptides of the invention to
appropriate cells is effected ex vivo, in situ, or in vivo by use of vectors, and more particularly
viral vectors (e.g., adenovirus, adeno-associated virus, or a retrovirus), or ex vivo by use of
physical DNA transfer methods (e.g., liposomes or chemical treatments). See, for example,
Anderson, Nature, supplement to vol. 392, no. 6679, pp. 25-30 (1998). For additional reviews of
gene therapy technology see Friedmann, Science, 244: 1275-1281 (1989); Verma, Scientific
American: 68-84 (1990); and Miller, Nature, 357: 455-460 (1992). Introduction of any one of
the nucleotides of the present invention or a gene encoding the polypeptides of the present
invention can also be accomplished with extrachromosomal substrates (transient expression) or
artificial chromosomes (stable expression). Cells may also be cultured ex vivo in the presence of
proteins of the present invention in order to proliferate or to produce a desired effect on or
activity in such cells. Treated cells can then be introduced in vivo for therapeutic purposes.
Alternatively, it is contemplated that in other human disease states, preventing the expression of
or inhibiting the activity of polypeptides of the invention will be useful in treating the disease
states. It is contemplated that antisense therapy or gene therapy could be applied to negatively
regulate the expression of polypeptides of the invention.

Other methods of inhibiting expression of a protein include the introduction of antisense
molecules to the nucleic acids of the present invention, their complements, or their translated RNA
sequences, by methods known in the art. Further, the polypeptides of the present invention can be
inhibited by using targeted deletion methods, or the insertion of a negative regulatory element such
as a silencer, which is tissue specific.

The present invention still further provides cells genetically engineered in vivo to express the
polynucleotides of the invention, wherein such polynucleotides are in operative association with a
regulatory sequence heterologous to the host cell which drives expression of the polynucleotides in
the cell. These methods can be used to increase or decrease the expression of the polynucleotides of
the present invention.

Knowledge of DNA sequences provided by the invention allows for modification of cells to
permit, increase, or decrease expression of endogenous polypeptide. Cells can be modified (e.g., by
homologous recombination) to provide increased polypeptide expression by replacing, in whole or
in part, the naturally occurring promoter with all or part of a heterologous promoter so that the cells
express the protein at higher levels. The heterologous promoter is inserted in such a manner that it is
operatively linked to the desired protein encoding sequences. See, for example, PCT International
Publication No. WO 94/12650, PCT International Publication No. WO 92/20808, and PCT
International Publication No. WO 91/09955. It is also contemplated that, in addition to heterologous
promoter DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or
intron DNA may be inserted along with the heterologous promoter DNA. If linked to the desired
protein coding sequence, amplification of the marker DNA by standard selection methods results in
co-amplification of the desired protein coding sequences in the cells.

In another embodiment of the present invention, cells and tissues may be engineered to
express an endogenous gene comprising the polynucleotides of the invention under the control of
inducible regulatory elements, in which case the regulatory sequences of the endogenous gene may
be replaced by homologous recombination. As described herein, gene targeting can be used to
replace a gene's existing regulatory region with a regulatory sequence isolated from a different gene
or a novel regulatory sequence synthesized by genetic engineering methods. Such regulatory
sequences may be comprised of promoters, enhancers, scaffold-attachment regions, negative
regulatory elements, transcriptional initiation sites, regulatory protein binding sites or combinations
of said sequences. Alternatively, sequences which affect the structure or stability of the RNA or
protein produced may be replaced, removed, added, or otherwise modified by targeting. These
sequences include polyadenylation signals, mRNA stability elements, splice sites, leader sequences
for enhancing or modifying transport or secretion properties of the protein, or other sequences
which alter or improve the function or stability of protein or RNA molecules.

The targeting event may be a simple insertion of the regulatory sequence, placing the gene
under the control of the new regulatory sequence, e.g., inserting a new promoter or enhancer or both
upstream of a gene. Alternatively, the targeting event may be a simple deletion of a regulatory
element, such as the deletion of a tissue-specific negative regulatory element. Alternatively, the
targeting event may replace an existing element; for example, a tissue-specific enhancer can be
replaced by an enhancer that has broader or different cell-type specificity than the naturally
occurring elements. Here, the naturally occurring sequences are deleted and new sequences are
added. In all cases, the identification of the targeting event may be facilitated by the use of one or
more selectable marker genes that are contiguous with the targeting DNA, allowing for the selection
of cells in which the exogenous DNA has integrated into the cell genome. The identification of the
targeting event may also be facilitated by the use of one or more marker genes exhibiting the
property of negative selection, such that the negatively selectable marker is linked to the exogenous
DNA, but configured such that the negatively selectable marker flanks the targeting sequence, and
such that a correct homologous recombination event with sequences in the host cell genome does
not result in the stable integration of the negatively selectable marker. Markers useful for this
purpose include the Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial
xanthine-guanine phosphoribosyl-transferase (gpt) gene.

The gene targeting or gene activation techniques which can be used in accordance with this
aspect of the invention are more particularly described in U.S. Patent No. 5,272,071 to Chappel;
U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No. PCT/US92/09627
(WO93/09222) by Selden et al.; and International Application No. PCT/US90/06436
(WO91/06667) by Skoultchi et al., each of which is incorporated by reference herein in its entirety.

4.9 TRANSGENIC ANIMALS 

In preferred methods to determine biological functions of the polypeptides of the
invention in vivo, one or more genes provided by the invention are either overexpressed or
inactivated in the germ line of animals using homologous recombination [Capecchi, Science
244:1288-1292 (1989)]. Animals in which the gene is overexpressed, under the regulatory
control of exogenous or endogenous promoter elements, are known as transgenic animals.
Animals in which an endogenous gene has been inactivated by homologous recombination are
referred to as "knockout" animals. Knockout animals, preferably non-human mammals, can be
prepared as described in U.S. Patent No. 5,557,032, incorporated herein by reference. Transgenic
animals are useful to determine the roles polypeptides of the invention play in biological
processes, and preferably in disease states. Transgenic animals are useful as model systems to
identify compounds that modulate lipid metabolism. Transgenic animals, preferably non-human
mammals, are produced using methods as described in U.S. Patent No. 5,489,743 and PCT
Publication No. WO94/28122, incorporated herein by reference.

Transgenic animals can be prepared wherein all or part of a promoter of the
polynucleotides of the invention is either activated or inactivated to alter the level of expression
of the polypeptides of the invention. Inactivation can be carried out using homologous
recombination methods described above. Activation can be achieved by supplementing or even
replacing the homologous promoter to provide for increased protein expression. The homologous
promoter can be supplemented by insertion of one or more heterologous enhancer elements
known to confer promoter activation in a particular tissue.

The polynucleotides of the present invention also make possible the development,
through, e.g., homologous recombination or knockout strategies, of animals that fail to express
polypeptides of the invention or that express a variant polypeptide. Such animals are useful as
models for studying the in vivo activities of polypeptide as well as for studying modulators of the
polypeptides of the invention.

4.10 USES AND BIOLOGICAL ACTIVITY 

The polynucleotides and proteins of the present invention are expected to exhibit one or
more of the uses or biological activities (including those associated with assays cited herein)
identified herein. Uses or activities described for proteins of the present invention may be
provided by administration or use of such proteins or of polynucleotides encoding such proteins
(such as, for example, in gene therapies or vectors suitable for introduction of DNA). The
mechanism underlying the particular condition or pathology will dictate whether the
polypeptides of the invention, the polynucleotides of the invention or modulators (activators or
inhibitors) thereof would be beneficial to the subject in need of treatment. Thus, "therapeutic
compositions of the invention" include compositions comprising isolated polynucleotides
(including recombinant DNA molecules, cloned genes and degenerate variants thereof) or
polypeptides of the invention (including full length protein, mature protein and truncations or
domains thereof), or compounds and other substances that modulate the overall activity of the
target gene products, either at the level of target gene/protein expression or target protein
activity. Such modulators include polypeptides, analogs (variants), including fragments and
fusion proteins, antibodies and other binding proteins; chemical compounds that directly or
indirectly activate or inhibit the polypeptides of the invention (identified, e.g., via drug screening
assays as described herein); antisense polynucleotides and polynucleotides suitable for triple
helix formation; and in particular antibodies or other binding partners that specifically recognize
one or more epitopes of the polypeptides of the invention.

The polypeptides of the present invention may likewise be involved in cellular activation
or in one of the other physiological pathways described herein.

4.10.1 RESEARCH USES AND UTILITIES 

The polynucleotides provided by the present invention can be used by the research
community for various purposes. The polynucleotides can be used to express recombinant
protein for analysis, characterization or therapeutic use; as markers for tissues in which the
corresponding protein is preferentially expressed (either constitutively or at a particular stage of
tissue differentiation or development or in disease states); as molecular weight markers on gels;
as chromosome markers or tags (when labeled) to identify chromosomes or to map related gene
positions; to compare with endogenous DNA sequences in patients to identify potential genetic
disorders; as probes to hybridize and thus discover novel, related DNA sequences; as a source of
information to derive PCR primers for genetic fingerprinting; as a probe to "subtract-out" known
sequences in the process of discovering other novel polynucleotides; for selecting and making
oligomers for attachment to a "gene chip" or other support, including for examination of
expression patterns; to raise anti-protein antibodies using DNA immunization techniques; and as
an antigen to raise anti-DNA antibodies or elicit another immune response. Where the
polynucleotide encodes a protein which binds or potentially binds to another protein (such as, for
example, in a receptor-ligand interaction), the polynucleotide can also be used in interaction trap
assays (such as, for example, that described in Gyuris et al., Cell 75:791-803 (1993)) to identify
polynucleotides encoding the other protein with which binding occurs or to identify inhibitors of
the binding interaction.

The polypeptides provided by the present invention can similarly be used in assays to
determine biological activity, including in a panel of multiple proteins for high-throughput
screening; to raise antibodies or to elicit another immune response; as a reagent (including the
labeled reagent) in assays designed to quantitatively determine levels of the protein (or its
receptor) in biological fluids; as markers for tissues in which the corresponding polypeptide is
preferentially expressed (either constitutively or at a particular stage of tissue differentiation or
development or in a disease state); and, of course, to isolate correlative receptors or ligands.
Proteins involved in these binding interactions can also be used to screen for peptide or small
molecule inhibitors or agonists of the binding interaction.

Any or all of these research utilities are capable of being developed into reagent grade or
kit format for commercialization as research products.

Methods for performing the uses listed above are well known to those skilled in the art.
References disclosing such methods include without limitation "Molecular Cloning: A
Laboratory Manual", 2d ed., Cold Spring Harbor Laboratory Press, Sambrook, J., E. F. Fritsch
and T. Maniatis eds., 1989, and "Methods in Enzymology: Guide to Molecular Cloning
Techniques", Academic Press, Berger, S. L. and A. R. Kimmel eds., 1987.

4.10.2 NUTRITIONAL USES

Polynucleotides and polypeptides of the present invention can also be used as nutritional
sources or supplements. Such uses include without limitation use as a protein or amino acid
supplement, use as a carbon source, use as a nitrogen source and use as a source of carbohydrate. In
such cases the polypeptide or polynucleotide of the invention can be added to the feed of a
particular organism or can be administered as a separate solid or liquid preparation, such as in the
form of powder, pills, solutions, suspensions or capsules. In the case of microorganisms, the
polypeptide or polynucleotide of the invention can be added to the medium in or on which the
microorganism is cultured.



4.10.3 CYTOKINE AND CELL PROLIFERATION/DIFFERENTIATION
ACTIVITY

A polypeptide of the present invention may exhibit activity relating to cytokine, cell
proliferation (either inducing or inhibiting) or cell differentiation (either inducing or inhibiting)
activity or may induce production of other cytokines in certain cell populations. A
polynucleotide of the invention can encode a polypeptide exhibiting such attributes. Many
protein factors discovered to date, including all known cytokines, have exhibited activity in one
or more factor-dependent cell proliferation assays, and hence the assays serve as a convenient
confirmation of cytokine activity. The activity of therapeutic compositions of the present
invention is evidenced by any one of a number of routine factor-dependent cell proliferation
assays for cell lines including, without limitation, 32D, DA2, DA1G, T10, B9, B9/11, BaF3,
MC9/G, M+ (preB M+), 2E8, RB5, DA1, 123, T1165, HT2, CTLL2, TF-1, Mo7e, CMK,
HUVEC, and Caco. Therapeutic compositions of the invention can be used in the following:

Assays for T-cell or thymocyte proliferation include, without limitation, those described
in: Current Protocols in Immunology, Ed. by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E.
M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3,
In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in
Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Bertagnolli et al., J. Immunol.
145:1706-1712, 1990; Bertagnolli et al., Cellular Immunology 133:327-341, 1991; Bertagnolli
et al., J. Immunol. 149:3778-3783, 1992; Bowman et al., J. Immunol. 152:1756-1761, 1994.

Assays for cytokine production and/or proliferation of spleen cells, lymph node cells or
thymocytes include, without limitation, those described in: Polyclonal T cell stimulation,
Kruisbeek, A. M. and Shevach, E. M. In Current Protocols in Immunology. J. E. e.a. Coligan
eds. Vol 1 pp. 3.12.1-3.12.14, John Wiley and Sons, Toronto. 1994; and Measurement of mouse
and human interleukin-γ, Schreiber, R. D. In Current Protocols in Immunology. J. E. e.a. Coligan
eds. Vol 1 pp. 6.8.1-6.8.8, John Wiley and Sons, Toronto. 1994.

Assays for proliferation and differentiation of hematopoietic and lymphopoietic cells
include, without limitation, those described in: Measurement of Human and Murine Interleukin 2
and Interleukin 4, Bottomly, K., Davis, L. S. and Lipsky, P. E. In Current Protocols in
Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 6.3.1-6.3.12, John Wiley and Sons, Toronto. 1991;
deVries et al., J. Exp. Med. 173:1205-1211, 1991; Moreau et al., Nature 336:690-692, 1988;
Greenberger et al., Proc. Natl. Acad. Sci. U.S.A. 80:2931-2938, 1983; Measurement of mouse
and human interleukin 6, Nordan, R. In Current Protocols in Immunology. J. E. Coligan eds. Vol
1 pp. 6.6.1-6.6.5, John Wiley and Sons, Toronto. 1991; Smith et al., Proc. Natl. Acad. Sci.
U.S.A. 83:1857-1861, 1986; Measurement of human Interleukin 11, Bennett, F., Giannotti, J.,
Clark, S. C. and Turner, K. J. In Current Protocols in Immunology. J. E. Coligan eds. Vol 1 pp.
6.15.1, John Wiley and Sons, Toronto. 1991; Measurement of mouse and human Interleukin
9, Ciarletta, A., Giannotti, J., Clark, S. C. and Turner, K. J. In Current Protocols in Immunology.
J. E. Coligan eds. Vol 1 pp. 6.13.1, John Wiley and Sons, Toronto. 1991.

Assays for T-cell clone responses to antigens (which will identify, among others, proteins
that affect APC-T cell interactions as well as direct T-cell effects by measuring proliferation and
cytokine production) include, without limitation, those described in: Current Protocols in
Immunology, Ed. by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober,
Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3, In Vitro assays for Mouse
Lymphocyte Function; Chapter 6, Cytokines and their cellular receptors; Chapter 7,
Immunologic studies in Humans); Weinberger et al., Proc. Natl. Acad. Sci. USA 77:6091-6095,
1980; Weinberger et al., Eur. J. Immun. 11:405-411, 1981; Takai et al., J. Immunol.
137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512, 1988.

4.10.4 STEM CELL GROWTH FACTOR ACTIVITY 

A polypeptide of the present invention may exhibit stem cell growth factor activity and
be involved in the proliferation, differentiation and survival of pluripotent and totipotent stem
cells including primordial germ cells, embryonic stem cells, hematopoietic stem cells and/or
germ line stem cells. Administration of the polypeptide of the invention to stem cells in vivo or
ex vivo is expected to maintain and expand cell populations in a totipotential or pluripotential
state which would be useful for re-engineering damaged or diseased tissues, transplantation,
manufacture of bio-pharmaceuticals and the development of bio-sensors. The ability to produce
large quantities of human cells has important working applications for the production of human
proteins which currently must be obtained from non-human sources or donors; implantation of
cells to treat diseases such as Parkinson's, Alzheimer's and other neurodegenerative diseases;
tissues for grafting such as bone marrow, skin, cartilage, tendons, bone, muscle (including
cardiac muscle), blood vessels, cornea, neural cells, gastrointestinal cells and others; and organs
for transplantation such as kidney, liver, pancreas (including islet cells), heart and lung.

It is contemplated that multiple different exogenous growth factors and/or cytokines may
be administered in combination with the polypeptide of the invention to achieve the desired
effect, including any of the growth factors listed herein, other stem cell maintenance factors, and
specifically including stem cell factor (SCF), leukemia inhibitory factor (LIF), Flt-3 ligand
(Flt-3L), any of the interleukins, recombinant soluble IL-6 receptor fused to IL-6, macrophage
inflammatory protein 1-alpha (MIP-1-alpha), G-CSF, GM-CSF, thrombopoietin (TPO), platelet
factor 4 (PF-4), platelet-derived growth factor (PDGF), neural growth factors and basic
fibroblast growth factor (bFGF).

Since totipotent stem cells can give rise to virtually any mature cell type, expansion of
these cells in culture will facilitate the production of large quantities of mature cells. Techniques
for culturing stem cells are known in the art and administration of polypeptides of the invention,
optionally with other growth factors and/or cytokines, is expected to enhance the survival and
proliferation of the stem cell populations. This can be accomplished by direct administration of
the polypeptide of the invention to the culture medium. Alternatively, stromal cells transfected
with a polynucleotide that encodes the polypeptide of the invention can be used as a feeder
layer for the stem cell populations in culture or in vivo. Stromal support cells for feeder layers
may include embryonic bone marrow fibroblasts, bone marrow stromal cells, fetal liver cells, or
cultured embryonic fibroblasts (see U.S. Patent No. 5,690,926).

Stem cells themselves can be transfected with a polynucleotide of the invention to induce
autocrine expression of the polypeptide of the invention. This will allow for generation of
undifferentiated totipotential/pluripotential stem cell lines that are useful as is or that can then be
differentiated into the desired mature cell types. These stable cell lines can also serve as a source
of undifferentiated totipotential/pluripotential mRNA to create cDNA libraries and templates for
polymerase chain reaction experiments. These studies would allow for the isolation and
identification of differentially expressed genes in stem cell populations that regulate stem cell
proliferation and/or maintenance.

Expansion and maintenance of totipotent stem cell populations will be useful in the
treatment of many pathological conditions. For example, polypeptides of the present invention
may be used to manipulate stem cells in culture to give rise to neuroepithelial cells that can be
used to augment or replace cells damaged by illness, autoimmune disease, accidental damage or
genetic disorders. The polypeptide of the invention may be useful for inducing the proliferation
of neural cells and for the regeneration of nerve and brain tissue, i.e., for the treatment of central
and peripheral nervous system diseases and neuropathies, as well as mechanical and traumatic
disorders which involve degeneration, death or trauma to neural cells or nerve tissue. In addition,
the expanded stem cell populations can also be genetically altered for gene therapy purposes and
to decrease host rejection of replacement tissues after grafting or implantation.

Expression of the polypeptide of the invention and its effect on stem cells can also be
manipulated to achieve controlled differentiation of the stem cells into more differentiated cell
types. A broadly applicable method of obtaining pure populations of a specific differentiated
cell type from undifferentiated stem cell populations involves the use of a cell-type specific
promoter driving a selectable marker. The selectable marker allows only cells of the desired type
to survive. For example, stem cells can be induced to differentiate into cardiomyocytes (Wobus
et al., Differentiation, 48: 173-182, (1991); Klug et al., J. Clin. Invest., 98(1): 216-224, (1998))
or skeletal muscle cells (Browder, L. W. In: Principles of Tissue Engineering eds. Lanza et al.,
Academic Press (1997)). Alternatively, directed differentiation of stem cells can be
accomplished by culturing the stem cells in the presence of a differentiation factor such as
retinoic acid and an antagonist of the polypeptide of the invention which would inhibit the
effects of endogenous stem cell factor activity and allow differentiation to proceed.

In vitro cultures of stem cells can be used to determine if the polypeptide of the invention
exhibits stem cell growth factor activity. Stem cells are isolated from any one of various cell
sources (including hematopoietic stem cells and embryonic stem cells) and cultured on a feeder
layer, as described by Thompson et al. Proc. Natl. Acad. Sci. U.S.A., 92: 7844-7848 (1995), in
the presence of the polypeptide of the invention alone or in combination with other growth 
factors or cytokines. The ability of the polypeptide of the invention to induce stem cell
proliferation is determined by colony formation on semi-solid support, e.g., as described by
Bernstein et al., Blood, 77: 2316-2321 (1991). 

4.10.5 HEMATOPOIESIS REGULATING ACTIVITY 

A polypeptide of the present invention may be involved in regulation of hematopoiesis
and, consequently, in the treatment of myeloid or lymphoid cell disorders. Even marginal
biological activity in support of colony forming cells or of factor-dependent cell lines indicates 
involvement in regulating hematopoiesis, e.g. in supporting the growth and proliferation of 
erythroid progenitor cells alone or in combination with other cytokines, thereby indicating 
utility, for example, in treating various anemias or for use in conjunction with
irradiation/chemotherapy to stimulate the production of erythroid precursors and/or erythroid
cells; in supporting the growth and proliferation of myeloid cells such as granulocytes and
monocytes/macrophages (i.e., traditional CSF activity) useful, for example, in conjunction with
chemotherapy to prevent or treat consequent myelo-suppression; in supporting the growth and 
proliferation of megakaryocytes and consequently of platelets thereby allowing prevention or
treatment of various platelet disorders such as thrombocytopenia, and generally for use in place
of or complementary to platelet transfusions; and/or in supporting the growth and proliferation of
hematopoietic stem cells which are capable of maturing to any and all of the above-mentioned 
hematopoietic cells and therefore find therapeutic utility in various stem cell disorders (such as 
those usually treated with transplantation, including, without limitation, aplastic anemia and
paroxysmal nocturnal hemoglobinuria), as well as in repopulating the stem cell compartment
post irradiation/chemotherapy, either in-vivo or ex-vivo (i.e., in conjunction with bone marrow
transplantation or with peripheral progenitor cell transplantation (homologous or heterologous))
as normal cells or genetically manipulated for gene therapy. 

Therapeutic compositions of the invention can be used in the following:

Suitable assays for proliferation and differentiation of various hematopoietic lines are
cited above.

Assays for embryonic stem cell differentiation (which will identify, among others, 
proteins that influence embryonic differentiation hematopoiesis) include, without limitation, 
those described in: Johansson et al. Cellular Biology 15:141-151, 1995; Keller et al., Molecular
and Cellular Biology 13:473-486, 1993; McClanahan et al., Blood 81:2903-2915, 1993.

Assays for stem cell survival and differentiation (which will identify, among others,
proteins that regulate lympho-hematopoiesis) include, without limitation, those described in: 
Methylcellulose colony forming assays, Freshney, M. G. In Culture of Hematopoietic Cells. R. I. 
Freshney, et al. eds. Vol pp. 265-268, Wiley-Liss, Inc., New York, N.Y. 1994; Hirayama et al., 
Proc. Natl. Acad. Sci. USA 89:5907-5911, 1992; Primitive hematopoietic colony forming cells
with high proliferative potential, McNiece, I. K. and Briddell, R. A. In Culture of Hematopoietic 
Cells. R. I. Freshney, et al. eds. Vol pp. 23-39, Wiley-Liss, Inc., New York, N.Y. 1994; Neben et 
al., Experimental Hematology 22:353-359, 1994; Cobblestone area forming cell assay, 
Ploemacher, R. E. In Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 1-21, 
Wiley-Liss, Inc., New York, N.Y. 1994; Long term bone marrow cultures in the presence of
stromal cells, Spooncer, E., Dexter, M., and Allen, T. In Culture of Hematopoietic Cells. R. I.
Freshney, et al. eds. Vol pp. 163-179, Wiley-Liss, Inc., New York, N.Y. 1994; Long term culture
initiating cell assay, Sutherland, H. J. In Culture of Hematopoietic Cells. R. I. Freshney, et al.
eds. Vol pp. 139-162, Wiley-Liss, Inc., New York, N.Y. 1994. 

4.10.6 TISSUE GROWTH ACTIVITY 

A polypeptide of the present invention also may be involved in bone, cartilage, tendon, 
ligament and/or nerve tissue growth or regeneration, as well as in wound healing and tissue 
repair and replacement, and in healing of burns, incisions and ulcers. 

A polypeptide of the present invention which induces cartilage and/or bone growth in
circumstances where bone is not normally formed has application in the healing of bone
fractures and cartilage damage or defects in humans and other animals. Compositions of a 
polypeptide, antibody, binding partner, or other modulator of the invention may have 
prophylactic use in closed as well as open fracture reduction and also in the improved fixation of
artificial joints. De novo bone formation induced by an osteogenic agent contributes to the
repair of congenital, trauma induced, or oncologic resection induced craniofacial defects, and 
also is useful in cosmetic plastic surgery. 

A polypeptide of this invention may also be involved in attracting bone-forming cells, 
stimulating growth of bone-forming cells, or inducing differentiation of progenitors of
bone-forming cells. Treatment of osteoporosis, osteoarthritis, bone degenerative disorders, or
periodontal disease, such as through stimulation of bone and/or cartilage repair or by blocking 
inflammation or processes of tissue destruction (collagenase activity, osteoclast activity, etc.) 
mediated by inflammatory processes may also be possible using the composition of the 
invention.

Another category of tissue regeneration activity that may involve the polypeptide of the
present invention is tendon/ligament formation. Induction of tendon/ligament-like tissue or 
other tissue formation in circumstances where such tissue is not normally formed, has 
application in the healing of tendon or ligament tears, deformities and other tendon or ligament
defects in humans and other animals. Such a preparation employing a tendon/ligament-like
tissue inducing protein may have prophylactic use in preventing damage to tendon or ligament 
tissue, as well as use in the improved fixation of tendon or ligament to bone or other tissues, and 
in repairing defects to tendon or ligament tissue. De novo tendon/ligament-like tissue formation 
induced by a composition of the present invention contributes to the repair of congenital, trauma
induced, or other tendon or ligament defects of other origin, and is also useful in cosmetic plastic
surgery for attachment or repair of tendons or ligaments. The compositions of the present
invention may provide an environment to attract tendon- or ligament-forming cells, stimulate
growth of tendon- or ligament-forming cells, induce differentiation of progenitors of tendon- or
ligament-forming cells, or induce growth of tendon/ligament cells or progenitors ex vivo for
return in vivo to effect tissue repair. The compositions of the invention may also be useful in the
treatment of tendinitis, carpal tunnel syndrome and other tendon or ligament defects. The 
compositions may also include an appropriate matrix and/or sequestering agent as a carrier as is 
well known in the art. 

The compositions of the present invention may also be useful for proliferation of neural
cells and for regeneration of nerve and brain tissue, i.e., for the treatment of central and peripheral
nervous system diseases and neuropathies, as well as mechanical and traumatic disorders, which 
involve degeneration, death or trauma to neural cells or nerve tissue. More specifically, a 
composition may be used in the treatment of diseases of the peripheral nervous system, such as 
peripheral nerve injuries, peripheral neuropathy and localized neuropathies, and central nervous
system diseases, such as Alzheimer's disease, Parkinson's disease, Huntington's disease, amyotrophic
lateral sclerosis, and Shy-Drager syndrome. Further conditions which may be treated in 
accordance with the present invention include mechanical and traumatic disorders, such as spinal 
cord disorders, head trauma and cerebrovascular diseases such as stroke. Peripheral neuropathies 
resulting from chemotherapy or other medical therapies may also be treatable using a
composition of the invention.

Compositions of the invention may also be useful to promote better or faster closure of 
non-healing wounds, including without limitation pressure ulcers, ulcers associated with vascular 
insufficiency, surgical and traumatic wounds, and the like. 

Compositions of the present invention may also be involved in the generation or
regeneration of other tissues, such as organs (including, for example, pancreas, liver, intestine,
kidney, skin, endothelium), muscle (smooth, skeletal or cardiac) and vascular (including vascular
endothelium) tissue, or for promoting the growth of cells comprising such tissues. Part of the
desired effects may be by inhibition or modulation of fibrotic scarring to allow normal tissue
to regenerate. A polypeptide of the present invention may also exhibit angiogenic activity.

A composition of the present invention may also be useful for gut protection or
regeneration and treatment of lung or liver fibrosis, reperfusion injury in various tissues, and 
conditions resulting from systemic cytokine damage. 

A composition of the present invention may also be useful for promoting or inhibiting 
differentiation of tissues described above from precursor tissues or cells; or for inhibiting the 
growth of tissues described above.

Therapeutic compositions of the invention can be used in the following: 
Assays for tissue generation activity include, without limitation, those described in: 
International Patent Publication No. WO95/16035 (bone, cartilage, tendon); International Patent 
Publication No. WO95/05846 (nerve, neuronal); International Patent Publication No. 
WO91/07491 (skin, endothelium). 

Assays for wound healing activity include, without limitation, those described in: Winter, 
Epidermal Wound Healing, pp. 71-112 (Maibach, H. I. and Rovee, D. T., eds.), Year Book
Medical Publishers, Inc., Chicago, as modified by Eaglstein and Mertz, J. Invest. Dermatol.
71:382-84 (1978).

4.10.7 IMMUNE STIMULATING OR SUPPRESSING ACTIVITY 

A polypeptide of the present invention may also exhibit immune stimulating or immune 
suppressing activity, including without limitation the activities for which assays are described 
herein. A polynucleotide of the invention can encode a polypeptide exhibiting such activities. A 
protein may be useful in the treatment of various immune deficiencies and disorders (including 
severe combined immunodeficiency (SCID)), e.g., in regulating (up or down) growth and 
proliferation of T and/or B lymphocytes, as well as affecting the cytolytic activity of NK cells
and other cell populations. These immune deficiencies may be genetic or be caused by viral (e.g.,
HIV) as well as bacterial or fungal infections, or may result from autoimmune disorders. More
specifically, infectious diseases caused by viral, bacterial, fungal or other infection may be
treatable using a protein of the present invention, including infections by HIV, hepatitis viruses, 
herpes viruses, mycobacteria, Leishmania spp., malaria spp. and various fungal infections such 
as candidiasis. Of course, in this regard, proteins of the present invention may also be useful 
where a boost to the immune system generally may be desirable, i.e., in the treatment of cancer.

Autoimmune disorders which may be treated using a protein of the present invention
include, for example, connective tissue disease, multiple sclerosis, systemic lupus erythematosus, 
rheumatoid arthritis, autoimmune pulmonary inflammation, Guillain-Barre syndrome, 
autoimmune thyroiditis, insulin dependent diabetes mellitus, myasthenia gravis, graft-versus-host
disease and autoimmune inflammatory eye disease. Such a protein (or antagonists thereof,
including antibodies) of the present invention may also be useful in the treatment of allergic
reactions and conditions (e.g., anaphylaxis, serum sickness, drug reactions, food allergies, insect 
venom allergies, mastocytosis, allergic rhinitis, hypersensitivity pneumonitis, urticaria, 
angioedema, eczema, atopic dermatitis, allergic contact dermatitis, erythema multiforme,
Stevens-Johnson syndrome, allergic conjunctivitis, atopic keratoconjunctivitis, vernal
keratoconjunctivitis, giant papillary conjunctivitis and contact allergies), such as asthma
(particularly allergic asthma) or other respiratory problems. Other conditions, in which immune 
suppression is desired (including, for example, organ transplantation), may also be treatable 
using a protein (or antagonists thereof) of the present invention. The therapeutic effects of the
polypeptides or antagonists thereof on allergic reactions can be evaluated by in vivo animal
models such as the cumulative contact enhancement test (Lastbom et al., Toxicology 125: 59-66,
1998), skin prick test (Hoffmann et al., Allergy 54: 446-54, 1999), guinea pig skin sensitization
test (Vohr et al., Arch. Toxicol. 73: 501-9), and murine local lymph node assay (Kimber et al.,
J. Toxicol. Environ. Health 53: 563-79). 

Using the proteins of the invention it may also be possible to modulate immune
responses in a number of ways. Down regulation may be in the form of inhibiting or blocking an
immune response already in progress or may involve preventing the induction of an immune 
response. The functions of activated T cells may be inhibited by suppressing T cell responses or 
by inducing specific tolerance in T cells, or both. Immunosuppression of T cell responses is
generally an active, non-antigen-specific, process which requires continuous exposure of the T
cells to the suppressive agent. Tolerance, which involves inducing non-responsiveness or anergy
in T cells, is distinguishable from immunosuppression in that it is generally antigen-specific and 
persists after exposure to the tolerizing agent has ceased. Operationally, tolerance can be 
demonstrated by the lack of a T cell response upon reexposure to specific antigen in the absence
of the tolerizing agent.

Down regulating or preventing one or more antigen functions (including without 
limitation B lymphocyte antigen functions (such as, for example, B7)), e.g., preventing high 
level lymphokine synthesis by activated T cells, will be useful in situations of tissue, skin and 
organ transplantation and in graft-versus-host disease (GVHD). For example, blockage of T cell
function should result in reduced tissue destruction in tissue transplantation. Typically, in tissue
transplants, rejection of the transplant is initiated through its recognition as foreign by T cells,
followed by an immune reaction that destroys the transplant. The administration of a therapeutic 
composition of the invention may prevent cytokine synthesis by immune cells, such as T cells, 
and thus acts as an immunosuppressant. Moreover, a lack of costimulation may also be sufficient 
to anergize the T cells, thereby inducing tolerance in a subject. Induction of long-term tolerance
by B lymphocyte antigen-blocking reagents may avoid the necessity of repeated administration 
of these blocking reagents. To achieve sufficient immunosuppression or tolerance in a subject, it 
may also be necessary to block the function of a combination of B lymphocyte antigens.

The efficacy of particular therapeutic compositions in preventing organ transplant
rejection or GVHD can be assessed using animal models that are predictive of efficacy in
humans. Examples of appropriate systems which can be used include allogeneic cardiac grafts in
rats and xenogeneic pancreatic islet cell grafts in mice, both of which have been used to examine 
the immunosuppressive effects of CTLA4Ig fusion proteins in vivo as described in Lenschow et
al., Science 257:789-792 (1992) and Turka et al., Proc. Natl. Acad. Sci. USA, 89:11102-11105
(1992). In addition, murine models of GVHD (see Paul ed., Fundamental Immunology, Raven
Press, New York, 1989, pp. 846-847) can be used to determine the effect of therapeutic 
compositions of the invention on the development of that disease. 

Blocking antigen function may also be therapeutically useful for treating autoimmune 
diseases. Many autoimmune disorders are the result of inappropriate activation of T cells that are 
reactive against self tissue and which promote the production of cytokines and autoantibodies
involved in the pathology of the diseases. Preventing the activation of autoreactive T cells may 
reduce or eliminate disease symptoms. Administration of reagents which block stimulation of T
cells can be used to inhibit T cell activation and prevent production of autoantibodies or T
cell-derived cytokines which may be involved in the disease process. Additionally, blocking
reagents may induce antigen-specific tolerance of autoreactive T cells which could lead to
long-term relief from the disease. The efficacy of blocking reagents in preventing or alleviating
autoimmune disorders can be determined using a number of well-characterized animal models of 
human autoimmune diseases. Examples include murine experimental autoimmune encephalitis, 
systemic lupus erythematosus in MRL/lpr/lpr mice or NZB hybrid mice, murine autoimmune
collagen arthritis, diabetes mellitus in NOD mice and BB rats, and murine experimental
myasthenia gravis (see Paul ed., Fundamental Immunology, Raven Press, New York, 1989, pp.
840-856). 

Upregulation of an antigen function (e.g., a B lymphocyte antigen function), as a means
of upregulating immune responses, may also be useful in therapy. Upregulation of immune
responses may be in the form of enhancing an existing immune response or eliciting an initial
immune response. For example, enhancing an immune response may be useful in cases of viral
infection, including systemic viral diseases such as influenza, the common cold, and 
encephalitis. 

Alternatively, anti-viral immune responses may be enhanced in an infected patient by
removing T cells from the patient, costimulating the T cells in vitro with viral antigen-pulsed
APCs either expressing a peptide of the present invention or together with a stimulatory form of 
a soluble peptide of the present invention and reintroducing the in vitro activated T cells into the 
patient. Another method of enhancing anti-viral immune responses would be to isolate infected 
cells from a patient, transfect them with a nucleic acid encoding a protein of the present
invention as described herein such that the cells express all or a portion of the protein on their
surface, and reintroduce the transfected cells into the patient. The infected cells would now be
capable of delivering a costimulatory signal to, and thereby activate, T cells in vivo. 

A polypeptide of the present invention may provide the necessary stimulation signal to T 
cells to induce a T cell mediated immune response against the transfected tumor cells. In
addition, tumor cells which lack MHC class I or MHC class II molecules, or which fail to
reexpress sufficient amounts of MHC class I or MHC class II molecules, can be transfected with
nucleic acid encoding all or a portion (e.g., a cytoplasmic-domain truncated portion) of an
MHC class I alpha chain protein and β2 microglobulin protein or an MHC class II alpha chain
protein and an MHC class II beta chain protein to thereby express MHC class I or MHC class II
proteins on the cell surface. Expression of the appropriate class I or class II MHC in conjunction
with a peptide having the activity of a B lymphocyte antigen (e.g., B7-1, B7-2, B7-3) induces a T 
cell mediated immune response against the transfected tumor cell. Optionally, a gene encoding 
an antisense construct which blocks expression of an MHC class II associated protein, such as 
the invariant chain, can also be cotransfected with a DNA encoding a peptide having the activity
of a B lymphocyte antigen to promote presentation of tumor associated antigens and induce
tumor specific immunity. Thus, the induction of a T cell mediated immune response in a human
subject may be sufficient to overcome tumor-specific tolerance in the subject. 

The activity of a protein of the invention may, among other means, be measured by the 
following methods:

Suitable assays for thymocyte or splenocyte cytotoxicity include, without limitation,
those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D.
H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and 
Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; 
Chapter 7, Immunologic studies in Humans); Herrmann et al., Proc. Natl. Acad. Sci. USA
78:2488-2492, 1981; Herrmann et al., J. Immunol. 128:1968-1974, 1982; Handa et al., J.
Immunol. 135:1564-1572, 1985; Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J.
Immunol. 140:508-512, 1988; Bowman et al., J. Virology 61:1992-1998; Bertagnolli et al.,
Cellular Immunology 133:327-341, 1991; Brown et al., J. Immunol. 153:3079-3092, 1994.

Assays for T-cell-dependent immunoglobulin responses and isotype switching (which
will identify, among others, proteins that modulate T-cell dependent antibody responses and that
affect Th1/Th2 profiles) include, without limitation, those described in: Maliszewski, J.
Immunol. 144:3028-3033, 1990; and Assays for B cell function: In vitro antibody production, 
Mond, J. J. and Brunswick, M. In Current Protocols in Immunology. J. E. e.a. Coligan eds. Vol 1 
pp. 3.8.1-3.8.16, John Wiley and Sons, Toronto. 1994.

Mixed lymphocyte reaction (MLR) assays (which will identify, among others, proteins
that generate predominantly Th1 and CTL responses) include, without limitation, those described
in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. 
M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3, 
In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in
Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512,
1988; Bertagnolli et al., J. Immunol. 149:3778-3783, 1992.

Dendritic cell-dependent assays (which will identify, among others, proteins expressed 
by dendritic cells that activate naive T-cells) include, without limitation, those described in: 
Guery et al., J. Immunol. 134:536-544, 1995; Inaba et al., Journal of Experimental Medicine
173:549-559, 1991; Macatonia et al., Journal of Immunology 154:5071-5079, 1995; Porgador et
al. Journal of Experimental Medicine 182:255-260, 1995; Nair et al. Journal of Virology
67:4062-4069, 1993; Huang et al. Science 264:961-965, 1994; Macatonia et al. Journal of
Experimental Medicine 169:1255-1264, 1989; Bhardwaj et al. Journal of Clinical Investigation
94:797-807, 1994; and Inaba et al., Journal of Experimental Medicine 172:631-640, 1990.

Assays for lymphocyte survival/apoptosis (which will identify, among others, proteins
that prevent apoptosis after superantigen induction and proteins that regulate lymphocyte
homeostasis) include, without limitation, those described in: Darzynkiewicz et al. Cytometry 
13:795-808, 1992; Gorczyca et al. Leukemia 7:659-670, 1993; Gorczyca et al. Cancer Research 
53:1945-1951, 1993; Itoh et al. Cell 66:233-243, 1991; Zacharchuk, Journal of Immunology 
145:4037-4045, 1990; Zamai et al., Cytometry 14:891-897, 1993; Gorczyca et al. International
Journal of Oncology 1:639-648, 1992. 

Assays for proteins that influence early steps of T-cell commitment and development 
include, without limitation, those described in: Antica et al. Blood 84:111-117, 1994; Fine et al.
Cellular Immunology 155:111-122, 1994; Galy et al. Blood 85:2770-2778, 1995; Toki et al.,
Proc. Natl. Acad. Sci. USA 88:7548-7551, 1991.

4.10.8 ACTIVIN/INHIBIN ACTIVITY

A polypeptide of the present invention may also exhibit activin- or inhibin-related 
activities. A polynucleotide of the invention may encode a polypeptide exhibiting such
characteristics. Inhibins are characterized by their ability to inhibit the release of follicle
stimulating hormone (FSH), while activins are characterized by their ability to stimulate the
release of follicle stimulating hormone (FSH). Thus, a polypeptide of the present invention,
alone or in heterodimers with a member of the inhibin family, may be useful as a contraceptive
based on the ability of inhibins to decrease fertility in female mammals and decrease
spermatogenesis in male mammals. Administration of sufficient amounts of other inhibins can
induce infertility in these mammals. Alternatively, the polypeptide of the invention, as a 
homodimer or as a heterodimer with other protein subunits of the inhibin group, may be useful as 
a fertility inducing therapeutic, based upon the ability of activin molecules to stimulate FSH
release from cells of the anterior pituitary. See, for example, U.S. Pat. No. 4,798,885. A
polypeptide of the invention may also be useful for advancement of the onset of fertility in
sexually immature mammals, so as to increase the lifetime reproductive performance of domestic
animals such as, but not limited to, cows, sheep and pigs. 

The activity of a polypeptide of the invention may, among other means, be measured by 
the following methods. 

Assays for activin/inhibin activity include, without limitation, those described in: Vale et
al., Endocrinology 91:562-572, 1972; Ling et al., Nature 321:779-782, 1986; Vale et al., Nature
321:776-779, 1986; Mason et al., Nature 318:659-663, 1985; Forage et al., Proc. Natl. Acad. Sci.
USA 83:3091-3095, 1986. 

4.10.9 CHEMOTACTIC/CHEMOKINETIC ACTIVITY

A polypeptide of the present invention may be involved in chemotactic or chemokinetic 
activity for mammalian cells, including, for example, monocytes, fibroblasts, neutrophils, 
T-cells, mast cells, eosinophils, epithelial and/or endothelial cells. A polynucleotide of the 
invention can encode a polypeptide exhibiting such attributes. Chemotactic and chemokinetic
receptor activation can be used to mobilize or attract a desired cell population to a desired site of
action. Chemotactic or chemokinetic compositions (e.g. proteins, antibodies, binding partners, or 
modulators of the invention) provide particular advantages in treatment of wounds and other 
trauma to tissues, as well as in treatment of localized infections. For example, attraction of 
lymphocytes, monocytes or neutrophils to tumors or sites of infection may result in improved
immune responses against the tumor or infecting agent.

A protein or peptide has chemotactic activity for a particular cell population if it can
stimulate, directly or indirectly, the directed orientation or movement of such cell population. 
Preferably, the protein or peptide has the ability to directly stimulate directed movement of cells. 
Whether a particular protein has chemotactic activity for a population of cells can be readily 
determined by employing such protein or peptide in any known assay for cell chemotaxis.

Therapeutic compositions of the invention can be used in the following:
Assays for chemotactic activity (which will identify proteins that induce or prevent 
chemotaxis) consist of assays that measure the ability of a protein to induce the migration of 
cells across a membrane as well as the ability of a protein to induce the adhesion of one cell 
population to another cell population. Suitable assays for movement and adhesion include,
without limitation, those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A.
M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates
and Wiley-Interscience (Chapter 6.12, Measurement of alpha and beta Chemokines
6.12.1-6.12.28); Taub et al. J. Clin. Invest. 95:1370-1376, 1995; Lind et al. APMIS 103:140-146,
1995; Muller et al. Eur. J. Immunol. 25:1744-1748; Gruber et al. J. of Immunol. 152:5860-5867,
1994; Johnston et al. J. of Immunol. 153:1762-1768, 1994. 

4.10.10 HEMOSTATIC AND THROMBOLYTIC ACTIVITY 

A polypeptide of the invention may also be involved in hemostasis or thrombolysis or
thrombosis. A polynucleotide of the invention can encode a polypeptide exhibiting such
attributes. Compositions may be useful in treatment of various coagulation disorders (including
hereditary disorders, such as hemophilias) or to enhance coagulation and other hemostatic events 
in treating wounds resulting from trauma, surgery or other causes. A composition of the 
invention may also be useful for dissolving or inhibiting formation of thromboses and for 
treatment and prevention of conditions resulting therefrom (such as, for example, infarction of
cardiac and central nervous system vessels (e.g., stroke)).

Therapeutic compositions of the invention can be used in the following: 
Assays for hemostatic and thrombolytic activity include, without limitation, those
described in: Linet et al., J. Clin. Pharmacol. 26:131-140, 1986; Burdick et al., Thrombosis Res.
45:413-419, 1987; Humphrey et al., Fibrinolysis 5:71-79 (1991); Schaub, Prostaglandins
35:467-474, 1988. 

4.10.11 CANCER DIAGNOSIS AND THERAPY 

Polypeptides of the invention may be involved in cancer cell generation, proliferation or 
metastasis. Detection of the presence or amount of polynucleotides or polypeptides of the
invention may be useful for the diagnosis and/or prognosis of one or more types of cancer. For
example, the presence or increased expression of a polynucleotide/polypeptide of the invention 
may indicate a hereditary risk of cancer, a precancerous condition, or an ongoing malignancy. 
Conversely, a defect in the gene or absence of the polypeptide may be associated with a cancer 
5 condition. Identification of single nuclfeotide polymorphisms associated with cancer or a 
predisposition to cancer may also be useful for diagnosis or prognosis. 

Cancer treatments promote tumor regression by inhibiting tumor cell proliferation, 
inhibiting angiogenesis (growth of new blood vessels that is necessary to support tumor growth) 
and/or prohibiting metastasis by reducing tumor cell motility or invasiveness. Therapeutic 
compositions of the invention may be effective in adult and pediatric oncology including in solid
phase tumors/malignancies, locally advanced tumors, human soft tissue sarcomas, metastatic
cancer, including lymphatic metastases, blood cell malignancies including multiple myeloma,
acute and chronic leukemias, and lymphomas, head and neck cancers including mouth cancer,
larynx cancer and thyroid cancer, lung cancers including small cell carcinoma and non-small cell
cancers, breast cancers including small cell carcinoma and ductal carcinoma, gastrointestinal
cancers including esophageal cancer, stomach cancer, colon cancer, colorectal cancer and polyps
associated with colorectal neoplasia, pancreatic cancers, liver cancer, urologic cancers including
bladder cancer and prostate cancer, malignancies of the female genital tract including ovarian
carcinoma, uterine (including endometrial) cancers, and solid tumor in the ovarian follicle,
kidney cancers including renal cell carcinoma, brain cancers including intrinsic brain tumors,
neuroblastoma, astrocytic brain tumors, gliomas, metastatic tumor cell invasion in the central
nervous system, bone cancers including osteomas, skin cancers including malignant melanoma,
tumor progression of human skin keratinocytes, squamous cell carcinoma, basal cell carcinoma,
hemangiopericytoma and Kaposi's sarcoma.

Polypeptides, polynucleotides, or modulators of polypeptides of the invention (including
inhibitors and stimulators of the biological activity of the polypeptide of the invention) may be
administered to treat cancer. Therapeutic compositions can be administered in therapeutically
effective dosages alone or in combination with adjuvant cancer therapy such as surgery,
chemotherapy, radiotherapy, thermotherapy, and laser therapy, and may provide a beneficial
effect, e.g., reducing tumor size, slowing rate of tumor growth, inhibiting metastasis, or otherwise
improving overall clinical condition, without necessarily eradicating the cancer.

The composition can also be administered in therapeutically effective amounts as a 
portion of an anti-cancer cocktail. An anti-cancer cocktail is a mixture of the polypeptide or 
modulator of the invention with one or more anti-cancer drugs in addition to a pharmaceutically 
acceptable carrier for delivery. The use of anti-cancer cocktails as a cancer treatment is routine.
Anti-cancer drugs that are well known in the art and can be used as a treatment in combination
with the polypeptide or modulator of the invention include: Actinomycin D, Aminoglutethimide,
Asparaginase, Bleomycin, Busulfan, Carboplatin, Carmustine, Chlorambucil, Cisplatin (cis-DDP),
Cyclophosphamide, Cytarabine HCl (Cytosine arabinoside), Dacarbazine, Dactinomycin,
Daunorubicin HCl, Doxorubicin HCl, Estramustine phosphate sodium, Etoposide (VP-16-213),
Floxuridine, 5-Fluorouracil (5-FU), Flutamide, Hydroxyurea (hydroxycarbamide), Ifosfamide,
Interferon Alpha-2a, Interferon Alpha-2b, Leuprolide acetate (LHRH-releasing factor analog),
Lomustine, Mechlorethamine HCl (nitrogen mustard), Melphalan, Mercaptopurine, Mesna,
Methotrexate (MTX), Mitomycin, Mitoxantrone HCl, Octreotide, Plicamycin, Procarbazine HCl,
Streptozocin, Tamoxifen citrate, Thioguanine, Thiotepa, Vinblastine sulfate, Vincristine sulfate,
Amsacrine, Azacitidine, Hexamethylmelamine, Interleukin-2, Mitoguazone, Pentostatin,
Semustine, Teniposide, and Vindesine sulfate.

In addition, therapeutic compositions of the invention may be used for prophylactic 
treatment of cancer. There are hereditary conditions and/or environmental situations (e.g.,
exposure to carcinogens) known in the art that predispose an individual to developing cancers.
Under these circumstances, it may be beneficial to treat these individuals with therapeutically
effective doses of the polypeptide of the invention to reduce the risk of developing cancers.

In vitro models can be used to determine the effective doses of the polypeptide of the
invention as a potential cancer treatment. These in vitro models include proliferation assays of
cultured tumor cells, growth of cultured tumor cells in soft agar (see Freshney, (1987) Culture of
Animal Cells: A Manual of Basic Technique, Wiley-Liss, New York, NY Ch 18 and Ch 21),
tumor systems in nude mice as described in Giovanella et al., J. Natl. Can. Inst., 52: 921-30
(1974), mobility and invasive potential of tumor cells in Boyden Chamber assays as described in
Pilkington et al., Anticancer Res., 17: 4107-9 (1997), and angiogenesis assays such as induction
of vascularization of the chick chorioallantoic membrane or induction of vascular endothelial
cell migration as described in Ribatta et al., Intl. J. Dev. Biol., 40: 1189-97 (1999) and Li et al.,
Clin. Exp. Metastasis, 17:423-9 (1999), respectively. Suitable tumor cell lines are available,
e.g., from American Type Tissue Culture Collection catalogs.

4.10.12 RECEPTOR/LIGAND ACTIVITY

A polypeptide of the present invention may also demonstrate activity as a receptor,
receptor ligand or inhibitor or agonist of receptor/ligand interactions. A polynucleotide of the
invention can encode a polypeptide exhibiting such characteristics. Examples of such receptors
and ligands include, without limitation, cytokine receptors and their ligands, receptor kinases and
their ligands, receptor phosphatases and their ligands, receptors involved in cell-cell interactions
and their ligands (including without limitation, cellular adhesion molecules (such as selectins,
integrins and their ligands) and receptor/ligand pairs involved in antigen presentation, antigen
recognition and development of cellular and humoral immune responses). Receptors and ligands
are also useful for screening of potential peptide or small molecule inhibitors of the relevant
receptor/ligand interaction. A protein of the present invention (including, without limitation,
fragments of receptors and ligands) may itself be useful as an inhibitor of receptor/ligand
interactions.

The activity of a polypeptide of the invention may, among other means, be measured by 
the following methods: 

Suitable assays for receptor-ligand activity include without limitation those described in:
Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M.
Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 7.28,
Measurement of Cellular Adhesion under static conditions 7.28.1-7.28.22), Takai et al., Proc.
Natl. Acad. Sci. USA 84:6864-6868, 1987; Bierer et al., J. Exp. Med. 168:1145-1156, 1988;
Rosenstein et al., J. Exp. Med. 169:149-160, 1989; Stoltenborg et al., J. Immunol. Methods
175:59-68, 1994; Stitt et al., Cell 80:661-670, 1995.

By way of example, the polypeptides of the invention may be used as a receptor for a
ligand(s) thereby transmitting the biological activity of that ligand(s). Ligands may be identified
through binding assays, affinity chromatography, dihybrid screening assays, BIAcore assays, gel
overlay assays, or other methods known in the art.

Studies characterizing drugs or proteins as agonists, antagonists, partial agonists or
partial antagonists require the use of other proteins as competing ligands. The polypeptides of the
present invention or ligand(s) thereof may be labeled by being coupled to radioisotopes,
colorimetric molecules or toxin molecules by conventional methods ("Guide to Protein
Purification" Murray P. Deutscher (ed) Methods in Enzymology Vol. 182 (1990) Academic
Press, Inc. San Diego). Examples of radioisotopes include, but are not limited to, tritium and
carbon-14. Examples of colorimetric molecules include, but are not limited to, fluorescent
molecules such as fluorescamine, or rhodamine or other colorimetric molecules. Examples of
toxins include, but are not limited to, ricin.

4.10.13 DRUG SCREENING 

This invention is particularly useful for screening chemical compounds by using the
novel polypeptides or binding fragments thereof in any of a variety of drug screening techniques.
The polypeptides or fragments employed in such a test may either be free in solution, affixed to a
solid support, borne on a cell surface or located intracellularly. One method of drug screening
utilizes eukaryotic or prokaryotic host cells which are stably transformed with recombinant
nucleic acids expressing the polypeptide or a fragment thereof. Drugs are screened against such
transformed cells in competitive binding assays. Such cells, either in viable or fixed form, can
be used for standard binding assays. One may measure, for example, the formation of
complexes between polypeptides of the invention or fragments and the agent being tested or
examine the diminution in complex formation between the novel polypeptides and an
appropriate cell line, which are well known in the art.
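
The diminution in complex formation described above is often summarized as a percent inhibition. The following is a minimal sketch under stated assumptions: the function name, the background-subtraction convention and the example readings are hypothetical and are not taken from the specification.

```python
def percent_inhibition(with_compound: float, without_compound: float,
                       background: float = 0.0) -> float:
    """Percent reduction in complex formation caused by a test compound.

    All arguments are raw binding signals in the same arbitrary units
    (hypothetical readout): signal with the test compound present, signal
    without it, and nonspecific background.
    """
    window = without_compound - background
    if window <= 0:
        raise ValueError("uninhibited signal must exceed background")
    return 100.0 * (1.0 - (with_compound - background) / window)


# Illustrative placeholder readings, not experimental data.
print(percent_inhibition(550.0, 1000.0, 100.0))
```

A compound that leaves the signal unchanged scores 0, and one that reduces it to background scores 100.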

Sources for test compounds that may be screened for ability to bind to or modulate (i.e.,
increase or decrease) the activity of polypeptides of the invention include (1) inorganic and
organic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries
comprised of either random or mimetic peptides, oligonucleotides or organic molecules.

Chemical libraries may be readily synthesized or purchased from a number of
commercial sources, and may include structural analogs of known compounds or compounds
that are identified as "hits" or "leads" via natural product screening.

The sources of natural product libraries are microorganisms (including bacteria and
fungi), animals, plants or other vegetation, or marine organisms, and libraries of mixtures for
screening may be created by: (1) fermentation and extraction of broths from soil, plant or marine
microorganisms or (2) extraction of the organisms themselves. Natural product libraries include
polyketides, non-ribosomal peptides, and (non-naturally occurring) variants thereof. For a
review, see Science 252:63-68 (1998).

Combinatorial libraries are composed of large numbers of peptides, oligonucleotides or
organic compounds and can be readily prepared by traditional automated synthesis methods,
PCR, cloning or proprietary synthetic methods. Of particular interest are peptide and
oligonucleotide combinatorial libraries. Still other libraries of interest include peptide, protein,
peptidomimetic, multiparallel synthetic collection, recombinatorial, and polypeptide libraries.
For a review of combinatorial chemistry and libraries created therefrom, see Myers, Curr. Opin.
Biotechnol. 8:701-707 (1997). For reviews and examples of peptidomimetic libraries, see
Al-Obeidi et al., Mol. Biotechnol. 9(3):205-23 (1998); Hruby et al., Curr. Opin. Chem. Biol.,
1(1):114-19 (1997); Dorner et al., Bioorg. Med. Chem., 4(5):709-15 (1996) (alkylated dipeptides).

Identification of modulators through use of the various libraries described herein permits
modification of the candidate "hit" (or "lead") to optimize the capacity of the "hit" to bind a
polypeptide of the invention. The molecules identified in the binding assay are then tested for
antagonist or agonist activity in in vivo tissue culture or animal models that are well known in the
art. In brief, the molecules are titrated into a plurality of cell cultures or animals and then tested
for either cell/animal death or prolonged survival of the animal/cells.
The binding molecules thus identified may be complexed with toxins, e.g., ricin or
cholera, or with other compounds that are toxic to cells such as radioisotopes. The toxin-binding
molecule complex is then targeted to a tumor or other cell by the specificity of the binding
molecule for a polypeptide of the invention. Alternatively, the binding molecules may be
complexed with imaging agents for targeting and imaging purposes.

4.10.14 ASSAY FOR RECEPTOR ACTIVITY 

The invention also provides methods to detect specific binding of a polypeptide, e.g., a
ligand or a receptor. The art provides numerous assays particularly useful for identifying
previously unknown binding partners for receptor polypeptides of the invention. For example,
expression cloning using mammalian or bacterial cells, or dihybrid screening assays can be used
to identify polynucleotides encoding binding partners. As another example, affinity
chromatography with the appropriate immobilized polypeptide of the invention can be used to
isolate polypeptides that recognize and bind polypeptides of the invention. There are a number
of different libraries used for the identification of compounds, and in particular small molecules,
that modulate (i.e., increase or decrease) biological activity of a polypeptide of the invention.
Ligands for receptor polypeptides of the invention can also be identified by adding exogenous
ligands, or cocktails of ligands, to two cell populations that are genetically identical except for
the expression of the receptor of the invention: one cell population expresses the receptor of the
invention whereas the other does not. The responses of the two cell populations to the addition of
ligand(s) are then compared. Alternatively, an expression library can be co-expressed with the
polypeptide of the invention in cells and assayed for an autocrine response to identify potential
ligand(s). As still another example, BIAcore assays, gel overlay assays, or other methods known
in the art can be used to identify binding partner polypeptides, including (1) organic and
inorganic chemical libraries, (2) natural product libraries, and (3) combinatorial libraries
comprised of random peptides, oligonucleotides or organic molecules.
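
The comparison of two otherwise identical cell populations described above can be illustrated with a short sketch. It assumes that each candidate ligand yields replicate numeric readouts in both populations. The function name, the two-fold threshold and the readings are hypothetical choices for illustration only.

```python
from statistics import mean


def receptor_dependent_hits(expressing: dict[str, list[float]],
                            control: dict[str, list[float]],
                            min_fold: float = 2.0) -> list[str]:
    """Return ligands whose mean response in receptor-expressing cells is at
    least min_fold times the mean response in receptor-negative cells."""
    hits = []
    for ligand, readings in expressing.items():
        baseline = mean(control[ligand])
        if baseline > 0 and mean(readings) / baseline >= min_fold:
            hits.append(ligand)
    return sorted(hits)


# Placeholder readouts (arbitrary units), not experimental data.
expressing_cells = {"ligand_A": [10.0, 12.0], "ligand_B": [3.0, 3.2]}
control_cells = {"ligand_A": [2.0, 2.0], "ligand_B": [3.0, 3.0]}
print(receptor_dependent_hits(expressing_cells, control_cells))
```

A ligand that produces a similar response in both populations is treated as receptor-independent and is not reported.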

The role of downstream intracellular signaling molecules in the signaling cascade of the
polypeptide of the invention can be determined. For example, a chimeric protein in which the
cytoplasmic domain of the polypeptide of the invention is fused to the extracellular portion of a
protein, whose ligand has been identified, is produced in a host cell. The cell is then incubated
with the ligand specific for the extracellular portion of the chimeric protein, thereby activating
the chimeric receptor. Known downstream proteins involved in intracellular signaling can then
be assayed for expected modifications, i.e., phosphorylation. Other methods known to those in the
art can also be used to identify signaling molecules involved in receptor activity.

4.10.15 ANTI-INFLAMMATORY ACTIVITY 

Compositions of the present invention may also exhibit anti-inflammatory activity. The
anti-inflammatory activity may be achieved by providing a stimulus to cells involved in the
inflammatory response, by inhibiting or promoting cell-cell interactions (such as, for example,
cell adhesion), by inhibiting or promoting chemotaxis of cells involved in the inflammatory
process, inhibiting or promoting cell extravasation, or by stimulating or suppressing production
of other factors which more directly inhibit or promote an inflammatory response. Compositions
with such activities can be used to treat inflammatory conditions (including chronic or acute
conditions), including without limitation inflammation associated with infection (such as septic
shock, sepsis or systemic inflammatory response syndrome (SIRS)), ischemia-reperfusion injury,
endotoxin lethality, arthritis, complement-mediated hyperacute rejection, nephritis, cytokine or
chemokine-induced lung injury, inflammatory bowel disease, Crohn's disease or inflammation
resulting from overproduction of cytokines such as TNF or IL-1. Compositions of the invention
may also be useful to treat anaphylaxis and hypersensitivity to an antigenic substance or material.
Compositions of this invention may be utilized to prevent or treat conditions such as, but not
limited to, sepsis, acute pancreatitis, endotoxin shock, cytokine induced shock, rheumatoid
arthritis, chronic inflammatory arthritis, pancreatic cell damage from diabetes mellitus type 1,
graft versus host disease, inflammatory bowel disease, inflammation associated with pulmonary
disease, other autoimmune disease or inflammatory disease, or as an antiproliferative agent, such
as for acute or chronic myelogenous leukemia, or in the prevention of premature labor secondary
to intrauterine infections.



4.10.16 LEUKEMIAS 

Leukemias and related disorders may be treated or prevented by administration of a
therapeutic that promotes or inhibits function of the polynucleotides and/or polypeptides of the
invention. Such leukemias and related disorders include but are not limited to acute leukemia,
acute lymphocytic leukemia, acute myelocytic leukemia, myeloblastic, promyelocytic,
myelomonocytic, monocytic, erythroleukemia, chronic leukemia, chronic myelocytic
(granulocytic) leukemia and chronic lymphocytic leukemia (for a review of such disorders, see
Fishman et al., 1985, Medicine, 2d Ed., J.B. Lippincott Co., Philadelphia).

4.10.17 NERVOUS SYSTEM DISORDERS 

Nervous system disorders, involving cell types which can be tested for efficacy of
intervention with compounds that modulate the activity of the polynucleotides and/or
polypeptides of the invention, and which can be treated upon thus observing an indication of
therapeutic utility, include but are not limited to nervous system injuries, and diseases or
disorders which result in either a disconnection of axons, a diminution or degeneration of
neurons, or demyelination. Nervous system lesions which may be treated in a patient (including
human and non-human mammalian patients) according to the invention include but are not
limited to the following lesions of either the central (including spinal cord, brain) or peripheral
nervous systems:

(i) traumatic lesions, including lesions caused by physical injury or associated with
surgery, for example, lesions which sever a portion of the nervous system, or compression
injuries;

(ii) ischemic lesions, in which a lack of oxygen in a portion of the nervous system
results in neuronal injury or death, including cerebral infarction or ischemia, or spinal cord
infarction or ischemia;

(iii) infectious lesions, in which a portion of the nervous system is destroyed or
injured as a result of infection, for example, by an abscess or associated with infection by human
immunodeficiency virus, herpes zoster, or herpes simplex virus or with Lyme disease,
tuberculosis, syphilis;

(iv) degenerative lesions, in which a portion of the nervous system is destroyed or
injured as a result of a degenerative process including but not limited to degeneration associated
with Parkinson's disease, Alzheimer's disease, Huntington's chorea, or amyotrophic lateral
sclerosis;

(v) lesions associated with nutritional diseases or disorders, in which a portion of the
nervous system is destroyed or injured by a nutritional disorder or disorder of metabolism
including but not limited to, vitamin B12 deficiency, folic acid deficiency, Wernicke disease,
tobacco-alcohol amblyopia, Marchiafava-Bignami disease (primary degeneration of the corpus
callosum), and alcoholic cerebellar degeneration;

(vi) neurological lesions associated with systemic diseases including but not limited to
diabetes (diabetic neuropathy, Bell's palsy), systemic lupus erythematosus, carcinoma, or
sarcoidosis;

(vii) lesions caused by toxic substances including alcohol, lead, or particular
neurotoxins; and

(viii) demyelinated lesions in which a portion of the nervous system is destroyed or
injured by a demyelinating disease including but not limited to multiple sclerosis, human
immunodeficiency virus-associated myelopathy, transverse myelopathy of various etiologies,
progressive multifocal leukoencephalopathy, and central pontine myelinolysis.

Therapeutics which are useful according to the invention for treatment of a nervous
system disorder may be selected by testing for biological activity in promoting the survival or
differentiation of neurons. For example, and not by way of limitation, therapeutics which elicit
any of the following effects may be useful according to the invention:

(i) increased survival time of neurons in culture;

(ii) increased sprouting of neurons in culture or in vivo;

(iii) increased production of a neuron-associated molecule in culture or in vivo, e.g.,
choline acetyltransferase or acetylcholinesterase with respect to motor neurons; or

(iv) decreased symptoms of neuron dysfunction in vivo.

Such effects may be measured by any method known in the art. In preferred,
non-limiting embodiments, increased survival of neurons may be measured by the method set
forth in Arakawa et al. (1990, J. Neurosci. 10:3507-3515); increased sprouting of neurons may
be detected by methods set forth in Pestronk et al. (1980, Exp. Neurol. 70:65-82) or Brown et al.
(1981, Ann. Rev. Neurosci. 4:17-42); increased production of neuron-associated molecules may
be measured by bioassay, enzymatic assay, antibody binding, Northern blot assay, etc.,
depending on the molecule to be measured; and motor neuron dysfunction may be measured by
assessing the physical manifestation of motor neuron disorder, e.g., weakness, motor neuron
conduction velocity, or functional disability.

In specific embodiments, motor neuron disorders that may be treated according to the
invention include but are not limited to disorders such as infarction, infection, exposure to toxin,
trauma, surgical damage, degenerative disease or malignancy that may affect motor neurons as
well as other components of the nervous system, as well as disorders that selectively affect
neurons such as amyotrophic lateral sclerosis, and including but not limited to progressive spinal
muscular atrophy, progressive bulbar palsy, primary lateral sclerosis, infantile and juvenile
muscular atrophy, progressive bulbar paralysis of childhood (Fazio-Londe syndrome),
poliomyelitis and the post polio syndrome, and Hereditary Motorsensory Neuropathy
(Charcot-Marie-Tooth Disease).

4.10.18 OTHER ACTIVITIES

A polypeptide of the invention may also exhibit one or more of the following additional
activities or effects: inhibiting the growth, infection or function of, or killing, infectious agents,
including, without limitation, bacteria, viruses, fungi and other parasites; affecting (suppressing
or enhancing) bodily characteristics, including, without limitation, height, weight, hair color, eye
color, skin, fat to lean ratio or other tissue pigmentation, or organ or body part size or shape
(such as, for example, breast augmentation or diminution, change in bone form or shape);
affecting biorhythms or circadian cycles or rhythms; affecting the fertility of male or female
subjects; affecting the metabolism, catabolism, anabolism, processing, utilization, storage or
elimination of dietary fat, lipid, protein, carbohydrate, vitamins, minerals, co-factors or other
nutritional factors or components; affecting behavioral characteristics, including, without
limitation, appetite, libido, stress, cognition (including cognitive disorders), depression
(including depressive disorders) and violent behaviors; providing analgesic effects or other pain
reducing effects; promoting differentiation and growth of embryonic stem cells in lineages other
than hematopoietic lineages; hormonal or endocrine activity; in the case of enzymes, correcting
deficiencies of the enzyme and treating deficiency-related diseases; treatment of
hyperproliferative disorders (such as, for example, psoriasis); immunoglobulin-like activity (such
as, for example, the ability to bind antigens or complement); and the ability to act as an antigen
in a vaccine composition to raise an immune response against such protein or another material or
entity which is cross-reactive with such protein.

4.10.19 IDENTIFICATION OF POLYMORPHISMS

The demonstration of polymorphisms makes possible the identification of such
polymorphisms in human subjects and the pharmacogenetic use of this information for diagnosis
and treatment. Such polymorphisms may be associated with, e.g., differential predisposition or
susceptibility to various disease states (such as disorders involving inflammation or immune
response) or a differential response to drug administration, and this genetic information can be
used to tailor preventive or therapeutic treatment appropriately. For example, the existence of a
polymorphism associated with a predisposition to inflammation or autoimmune disease makes
possible the diagnosis of this condition in humans by identifying the presence of the
polymorphism.

Polymorphisms can be identified in a variety of ways known in the art which all
generally involve obtaining a sample from a patient, analyzing DNA from the sample, optionally
involving isolation or amplification of the DNA, and identifying the presence of the
polymorphism in the DNA. For example, PCR may be used to amplify an appropriate fragment
of genomic DNA which may then be sequenced. Alternatively, the DNA may be subjected to
allele-specific oligonucleotide hybridization (in which appropriate oligonucleotides are
hybridized to the DNA under conditions permitting detection of a single base mismatch) or to a
single nucleotide extension assay (in which an oligonucleotide that hybridizes immediately
adjacent to the position of the polymorphism is extended with one or more labeled nucleotides).
In addition, traditional restriction fragment length polymorphism analysis (using restriction
enzymes that provide differential digestion of the genomic DNA depending on the presence or
absence of the polymorphism) may be performed. Arrays with nucleotide sequences of the
present invention can be used to detect polymorphisms. The array can comprise modified
nucleotide sequences of the present invention in order to detect the nucleotide sequences of the
present invention. In the alternative, any one of the nucleotide sequences of the present
invention can be placed on the array to detect changes from those sequences.
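
The restriction fragment length polymorphism analysis mentioned above can be illustrated in silico. The sketch below assumes a single-stranded text representation of an amplified fragment and uses the EcoRI recognition site (G^AATTC). The two 30-base sequences are invented for illustration and are not sequences of the invention.

```python
def fragment_lengths(seq: str, site: str, cut_offset: int) -> list[int]:
    """Lengths of the fragments produced by cutting seq at every occurrence
    of site; cut_offset is the number of site bases left of the cut."""
    cuts = []
    start = 0
    while True:
        idx = seq.find(site, start)
        if idx == -1:
            break
        cuts.append(idx + cut_offset)
        start = idx + 1
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


# Hypothetical amplicons that differ at one base (A -> G inside the site).
REFERENCE = "ATGCCGTTAGGAATTCCGGCTAACGTTGAC"
VARIANT = "ATGCCGTTAGGAGTTCCGGCTAACGTTGAC"

print(fragment_lengths(REFERENCE, "GAATTC", 1))  # site present: two fragments
print(fragment_lengths(VARIANT, "GAATTC", 1))    # site destroyed: one fragment
```

The differing fragment patterns are what a gel would resolve. A real analysis would also have to consider both strands and partial digestion, which this sketch ignores.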

Alternatively, a polymorphism resulting in a change in the amino acid sequence could
also be detected by detecting a corresponding change in amino acid sequence of the protein, e.g.,
by an antibody specific to the variant sequence.
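
Only a polymorphism that alters the encoded amino acid can be detected at the protein level. As a rough sketch, assuming the standard genetic code and a codon already in the correct reading frame, a single-codon change can be classified as follows. The function and variable names are illustrative and not part of the disclosure.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}


def classify_codon_change(ref_codon: str, alt_codon: str) -> str:
    """Classify a codon substitution as synonymous, missense or nonsense."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"
    return "missense"


print(classify_codon_change("GAA", "GAG"))  # Glu -> Glu, no protein change
print(classify_codon_change("GAG", "GTG"))  # Glu -> Val, protein change
```

A variant-specific antibody could in principle distinguish only the missense case. A synonymous change would have to be detected at the nucleic acid level.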

4.10.20 ARTHRITIS AND INFLAMMATION

The immunosuppressive effects of the compositions of the invention against rheumatoid
arthritis are determined in an experimental animal model system. The experimental model system
is adjuvant induced arthritis in rats, and the protocol is described by J. Holoshitz, et al., 1983,
Science, 219:56, or by B. Waksman et al., 1963, Int. Arch. Allergy Appl. Immunol., 23:129.

Induction of the disease can be caused by a single injection, generally intradermally, of a
suspension of killed Mycobacterium tuberculosis in complete Freund's adjuvant (CFA). The
route of injection can vary, but rats may be injected at the base of the tail with an adjuvant
mixture. The polypeptide is administered in phosphate buffered solution (PBS) at a dose of about
1-5 mg/kg. The control consists of administering PBS only.

The procedure for testing the effects of the test compound would consist of intradermally
injecting killed Mycobacterium tuberculosis in CFA followed by immediately administering the
test compound and subsequent treatment every other day until day 24. At 14, 15, 18, 20, 22, and
24 days after injection of Mycobacterium CFA, an overall arthritis score may be obtained as
described by J. Holoshitz above. An analysis of the data would reveal that the test compound
would have a dramatic effect on the swelling of the joints as measured by a decrease of the
arthritis score.
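
The dosing and scoring schedule described above can be laid out programmatically. The following sketch assumes that "every other day until day 24" means days 0, 2, ..., 24, and that the treatment effect is summarized as the mean difference between control (PBS only) and treated scores. Both are interpretive assumptions, and the scores shown are placeholders rather than results.

```python
from statistics import mean

# Days after adjuvant injection on which the arthritis score is recorded.
SCORING_DAYS = [14, 15, 18, 20, 22, 24]


def dosing_days(last_day: int = 24, interval: int = 2) -> list[int]:
    """Days of test-compound administration: day 0, then every other day."""
    return list(range(0, last_day + 1, interval))


def mean_score_reduction(treated: dict[int, float],
                         control: dict[int, float]) -> float:
    """Average (control - treated) arthritis score over the scoring days."""
    return mean(control[d] - treated[d] for d in SCORING_DAYS)


# Placeholder scores for illustration only, not experimental data.
control_scores = {d: 8.0 for d in SCORING_DAYS}
treated_scores = {d: 5.0 for d in SCORING_DAYS}
print(dosing_days())
print(mean_score_reduction(treated_scores, control_scores))
```

A positive value indicates a lower arthritis score in the treated group than in the PBS control.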

4.11 THERAPEUTIC METHODS 

The compositions (including polypeptide fragments, analogs, variants and antibodies or
other binding partners or modulators including antisense polynucleotides) of the invention have
numerous applications in a variety of therapeutic methods. Examples of therapeutic applications
include, but are not limited to, those exemplified herein.

4.11.1 EXAMPLE 




WO 01/57188 PCT/US01/03800 

One embodiment of the invention is the administration of an effective amount of the
polypeptides or other composition of the invention to individuals affected by a disease or
disorder that can be modulated by regulating the peptides of the invention. While the mode of
administration is not particularly important, parenteral administration is preferred. An
exemplary mode of administration is to deliver an intravenous bolus. The dosage of the
polypeptides or other composition of the invention will normally be determined by the
prescribing physician. It is to be expected that the dosage will vary according to the age, weight,
condition and response of the individual patient. Typically, the amount of polypeptide
administered per dose will be in the range of about 0.01 µg/kg to 100 mg/kg of body weight, with
the preferred dose being about 0.1 µg/kg to 10 mg/kg of patient body weight. For parenteral
administration, polypeptides of the invention will be formulated in an injectable form combined
with a pharmaceutically acceptable parenteral vehicle. Such vehicles are well known in the art
and examples include water, saline, Ringer's solution, dextrose solution, and solutions consisting
of small amounts of human serum albumin. The vehicle may contain minor amounts of
additives that maintain the isotonicity and stability of the polypeptide or other active ingredient.
The preparation of such solutions is within the skill of the art.
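
The weight-based ranges above translate into per-dose amounts by simple multiplication. The sketch below is an arithmetic illustration only, taking the lower bounds as 0.01 µg/kg and 0.1 µg/kg. The constant and function names are hypothetical, and the actual dosage remains a matter for the prescribing physician as stated above.

```python
UG_PER_MG = 1000.0

# (lower, upper) bounds in mg per kg of body weight.
GENERAL_RANGE_MG_PER_KG = (0.01 / UG_PER_MG, 100.0)
PREFERRED_RANGE_MG_PER_KG = (0.1 / UG_PER_MG, 10.0)


def dose_bounds_mg(weight_kg: float,
                   range_mg_per_kg: tuple[float, float]) -> tuple[float, float]:
    """Lower and upper per-dose amounts (mg) for a given body weight."""
    if weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low, high = range_mg_per_kg
    return (low * weight_kg, high * weight_kg)


# Example body weight of 70 kg, chosen only for illustration.
print(dose_bounds_mg(70.0, GENERAL_RANGE_MG_PER_KG))
print(dose_bounds_mg(70.0, PREFERRED_RANGE_MG_PER_KG))
```

The general range spans seven orders of magnitude, which is why the text defers the actual dose to the individual patient's age, weight, condition and response.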

4.12 PHARMACEUTICAL FORMULATIONS AND ROUTES OF 
ADMINISTRATION 

A protein or other composition of the present invention (from whatever source derived,
including without limitation from recombinant and non-recombinant sources and including
antibodies and other binding partners of the polypeptides of the invention) may be administered
to a patient in need, by itself, or in pharmaceutical compositions where it is mixed with suitable
carriers or excipient(s) at doses to treat or ameliorate a variety of disorders. Such a composition
may optionally contain (in addition to protein or other active ingredient and a carrier) diluents,
fillers, salts, buffers, stabilizers, solubilizers, and other materials well known in the art. The term
"pharmaceutically acceptable" means a non-toxic material that does not interfere with the
effectiveness of the biological activity of the active ingredient(s). The characteristics of the
carrier will depend on the route of administration. The pharmaceutical composition of the
invention may also contain cytokines, lymphokines, or other hematopoietic factors such as
M-CSF, GM-CSF, TNF, IL-1, IL-2, IL-3, IL-4, IL-5, IL-6, IL-7, IL-8, IL-9, IL-10, IL-11, IL-12,
IL-13, IL-14, IL-15, IFN, TNF0, TNF1, TNF2, G-CSF, Meg-CSF, thrombopoietin, stem cell
factor, and erythropoietin. In further compositions, proteins of the invention may be combined
with other agents beneficial to the treatment of the disease or disorder in question. These agents
include various growth factors such as epidermal growth factor (EGF), platelet-derived growth
factor (PDGF), transforming growth factors (TGF-α and TGF-β), insulin-like growth factor
(IGF), as well as cytokines described herein.

The pharmaceutical composition may further contain other agents which either enhance 
the activity of the protein or other active ingredient or complement its activity or use in 
treatment. Such additional factors and/or agents may be included in the pharmaceutical
composition to produce a synergistic effect with protein or other active ingredient of the
invention, or to minimize side effects. Conversely, protein or other active ingredient of the
present invention may be included in formulations of the particular clotting factor, cytokine,
lymphokine, other hematopoietic factor, thrombolytic or anti-thrombotic factor, or
anti-inflammatory agent to minimize side effects of the clotting factor, cytokine, lymphokine, other
hematopoietic factor, thrombolytic or anti-thrombotic factor, or anti-inflammatory agent (such as
IL-1Ra, IL-1 Hy1, IL-1 Hy2, anti-TNF, corticosteroids, immunosuppressive agents). A protein
of the present invention may be active in multimers (e.g., heterodimers or homodimers) or
complexes with itself or other proteins. As a result, pharmaceutical compositions of the
invention may comprise a protein of the invention in such multimeric or complexed form.

As an alternative to being included in a pharmaceutical composition of the invention 
including a first protein, a second protein or a therapeutic agent may be concurrently 
administered with the first protein (e.g., at the same time, or at differing times provided that 
therapeutic concentrations of the combination of agents are achieved at the treatment site).
Techniques for formulation and administration of the compounds of the instant application may
be found in "Remington's Pharmaceutical Sciences," Mack Publishing Co., Easton, PA, latest
edition. A therapeutically effective dose further refers to that amount of the compound sufficient
to result in amelioration of symptoms, e.g., treatment, healing, prevention or amelioration of the
relevant medical condition, or an increase in rate of treatment, healing, prevention or
amelioration of such conditions. When applied to an individual active ingredient, administered
alone, a therapeutically effective dose refers to that ingredient alone. When applied to a
combination, a therapeutically effective dose refers to combined amounts of the active
ingredients that result in the therapeutic effect, whether administered in combination, serially or
simultaneously.



In practicing the method of treatment or use of the present invention, a therapeutically
effective amount of protein or other active ingredient of the present invention is administered to
a mammal having a condition to be treated. Protein or other active ingredient of the present
invention may be administered in accordance with the method of the invention either alone or in
combination with other therapies such as treatments employing cytokines, lymphokines or other
hematopoietic factors. When co-administered with one or more cytokines, lymphokines or other
hematopoietic factors, protein or other active ingredient of the present invention may be
administered either simultaneously with the cytokine(s), lymphokine(s), other hematopoietic
factor(s), thrombolytic or anti-thrombotic factors, or sequentially. If administered sequentially,
the attending physician will decide on the appropriate sequence of administering protein or other
active ingredient of the present invention in combination with cytokine(s), lymphokine(s), other
hematopoietic factor(s), thrombolytic or anti-thrombotic factors.

4.12.1 ROUTES OF ADMINISTRATION 

Suitable routes of administration may, for example, include oral, rectal, transmucosal, or 
intestinal administration; parenteral delivery, including intramuscular, subcutaneous,
intramedullary injections, as well as intrathecal, direct intraventricular, intravenous, 
intraperitoneal, intranasal, or intraocular injections. Administration of protein or other active 
ingredient of the present invention used in the pharmaceutical composition or to practice the 
method of the present invention can be carried out in a variety of conventional ways, such as oral 
ingestion, inhalation, topical application or cutaneous, subcutaneous, intraperitoneal, parenteral
or intravenous injection. Intravenous administration to the patient is preferred. 

Alternately, one may administer the compound in a local rather than systemic manner, for 
example, via injection of the compound directly into arthritic joints or fibrotic tissue, often
in a depot or sustained release formulation. In order to prevent the scarring process frequently
occurring as a complication of glaucoma surgery, the compounds may be administered topically,
for example, as eye drops. Furthermore, one may administer the drug in a targeted drug delivery
system, for example, in a liposome coated with a specific antibody, targeting, for example,
arthritic or fibrotic tissue. The liposomes will be targeted to and taken up selectively by the
afflicted tissue. 

The polypeptides of the invention are administered by any route that delivers an effective
dosage to the desired site of action. The determination of a suitable route of administration and
an effective dosage for a particular indication is within the level of skill in the art. Preferably for
wound treatment, one administers the therapeutic compound directly to the site. Suitable dosage
ranges for the polypeptides of the invention can be extrapolated from these dosages or from
similar studies in appropriate animal models. Dosages can then be adjusted as necessary by the
clinician to provide maximal therapeutic benefit.

4.12.2 COMPOSITIONS/FORMULATIONS 

Pharmaceutical compositions for use in accordance with the present invention thus may 
be formulated in a conventional manner using one or more physiologically acceptable carriers
comprising excipients and auxiliaries which facilitate processing of the active compounds into
preparations which can be used pharmaceutically. These pharmaceutical compositions may be
manufactured in a manner that is itself known, e.g., by means of conventional mixing,
dissolving, granulating, dragee-making, levigating, emulsifying, encapsulating, entrapping or
lyophilizing processes. Proper formulation is dependent upon the route of administration
chosen. When a therapeutically effective amount of protein or other active ingredient of the 
present invention is administered orally, protein or other active ingredient of the present 
invention will be in the form of a tablet, capsule, powder, solution or elixir. When administered 
in tablet form, the pharmaceutical composition of the invention may additionally contain a solid 
carrier such as a gelatin or an adjuvant. The tablet, capsule, and powder contain from about 5 to
95% protein or other active ingredient of the present invention, and preferably from about 25 to 
90% protein or other active ingredient of the present invention. When administered in liquid 
form, a liquid carrier such as water, petroleum, oils of animal or plant origin such as peanut oil, 
mineral oil, soybean oil, or sesame oil, or synthetic oils may be added. The liquid form of the 
pharmaceutical composition may further contain physiological saline solution, dextrose or other
saccharide solution, or glycols such as ethylene glycol, propylene glycol or polyethylene glycol. 
When administered in liquid form, the pharmaceutical composition contains from about 0.5 to 
90% by weight of protein or other active ingredient of the present invention, and preferably from 
about 1 to 50% protein or other active ingredient of the present invention.

When a therapeutically effective amount of protein or other active ingredient of the
present invention is administered by intravenous, cutaneous or subcutaneous injection, protein or
other active ingredient of the present invention will be in the form of a pyrogen-free, parenterally 
acceptable aqueous solution. The preparation of such parenterally acceptable protein or other 
active ingredient solutions, having due regard to pH, isotonicity, stability, and the like, is within 
the skill in the art. A preferred pharmaceutical composition for intravenous, cutaneous, or
subcutaneous injection should contain, in addition to protein or other active ingredient of the 
present invention, an isotonic vehicle such as Sodium Chloride Injection, Ringer's Injection, 
Dextrose Injection, Dextrose and Sodium Chloride Injection, Lactated Ringer's Injection, or 
other vehicle as known in the art. The pharmaceutical composition of the present invention may 
also contain stabilizers, preservatives, buffers, antioxidants, or other additives known to those of
skill in the art. For injection, the agents of the invention may be formulated in aqueous
solutions, preferably in physiologically compatible buffers such as Hanks' solution, Ringer's
solution, or physiological saline buffer. For transmucosal administration, penetrants appropriate 
to the barrier to be permeated are used in the formulation. Such penetrants are generally known 
in the art.

For oral administration, the compounds can be formulated readily by combining the
active compounds with pharmaceutically acceptable carriers well known in the art. Such carriers
enable the compounds of the invention to be formulated as tablets, pills, dragees, capsules, 
liquids, gels, syrups, slurries, suspensions and the like, for oral ingestion by a patient to be 
treated. Pharmaceutical preparations for oral use can be obtained by mixing the active compounds with a solid excipient,
optionally grinding a resulting mixture, and processing the mixture of granules, after adding 
suitable auxiliaries, if desired, to obtain tablets or dragee cores. Suitable excipients are, in 
particular, fillers such as sugars, including lactose, sucrose, mannitol, or sorbitol; cellulose 
preparations such as, for example, maize starch, wheat starch, rice starch, potato starch, gelatin, 
gum tragacanth, methyl cellulose, hydroxypropylmethyl-cellulose, sodium
carboxymethylcellulose, and/or polyvinylpyrrolidone (PVP). If desired, disintegrating agents
may be added, such as the cross-linked polyvinyl pyrrolidone, agar, or alginic acid or a salt 
thereof such as sodium alginate. Dragee cores are provided with suitable coatings. For this 
purpose, concentrated sugar solutions may be used, which may optionally contain gum arabic, 
talc, polyvinyl pyrrolidone, carbopol gel, polyethylene glycol, and/or titanium dioxide, lacquer
solutions, and suitable organic solvents or solvent mixtures. Dyestuffs or pigments may be
added to the tablets or dragee coatings for identification or to characterize different combinations 
of active compound doses. 

Pharmaceutical preparations which can be used orally include push-fit capsules made of 
gelatin, as well as soft, sealed capsules made of gelatin and a plasticizer, such as glycerol or
sorbitol. The push-fit capsules can contain the active ingredients in admixture with filler such as
lactose, binders such as starches, and/or lubricants such as talc or magnesium stearate and, 
optionally, stabilizers. In soft capsules, the active compounds may be dissolved or suspended in 
suitable liquids, such as fatty oils, liquid paraffin, or liquid polyethylene glycols. In addition, 
stabilizers may be added. All formulations for oral administration should be in dosages suitable
for such administration. For buccal administration, the compositions may take the form of 
tablets or lozenges formulated in conventional manner. 

For administration by inhalation, the compounds for use according to the present 
invention are conveniently delivered in the form of an aerosol spray presentation from 
pressurized packs or a nebuliser, with the use of a suitable propellant, e.g.,
dichlorodifluoromethane, trichlorofluoromethane, dichlorotetrafluoroethane, carbon dioxide or
other suitable gas. In the case of a pressurized aerosol the dosage unit may be determined by
providing a valve to deliver a metered amount. Capsules and cartridges of, e.g., gelatin for use
in an inhaler or insufflator may be formulated containing a powder mix of the compound and a
suitable powder base such as lactose or starch. The compounds may be formulated for parenteral
administration by injection, e.g., by bolus injection or continuous infusion. Formulations for
injection may be presented in unit dosage form, e.g., in ampules or in multi-dose containers, with
an added preservative. The compositions may take such forms as suspensions, solutions or
emulsions in oily or aqueous vehicles, and may contain formulatory agents such as suspending,
stabilizing and/or dispersing agents.

Pharmaceutical formulations for parenteral administration include aqueous solutions of 
the active compounds in water-soluble form. Additionally, suspensions of the active compounds 
may be prepared as appropriate oily injection suspensions. Suitable lipophilic solvents or 
vehicles include fatty oils such as sesame oil, or synthetic fatty acid esters, such as ethyl oleate or 
triglycerides, or liposomes. Aqueous injection suspensions may contain substances which
increase the viscosity of the suspension, such as sodium carboxymethyl cellulose, sorbitol, or 
dextran. Optionally, the suspension may also contain suitable stabilizers or agents which 
increase the solubility of the compounds to allow for the preparation of highly concentrated 
solutions. Alternatively, the active ingredient may be in powder form for constitution with a
suitable vehicle, e.g., sterile pyrogen-free water, before use.

The compounds may also be formulated in rectal compositions such as suppositories or 
retention enemas, e.g., containing conventional suppository bases such as cocoa butter or other 
glycerides. In addition to the formulations described previously, the compounds may also be 
formulated as a depot preparation. Such long acting formulations may be administered by
implantation (for example subcutaneously or intramuscularly) or by intramuscular injection.
Thus, for example, the compounds may be formulated with suitable polymeric or hydrophobic 
materials (for example as an emulsion in an acceptable oil) or ion exchange resins, or as 
sparingly soluble derivatives, for example, as a sparingly soluble salt. 

A pharmaceutical carrier for the hydrophobic compounds of the invention is a co-solvent
system comprising benzyl alcohol, a nonpolar surfactant, a water-miscible organic polymer, and
an aqueous phase. The co-solvent system may be the VPD co-solvent system. VPD is a solution 
of 3% w/v benzyl alcohol, 8% w/v of the nonpolar surfactant polysorbate 80, and 65% w/v 
polyethylene glycol 300, made up to volume in absolute ethanol. The VPD co-solvent system 
(VPD:5W) consists of VPD diluted 1:1 with a 5% dextrose in water solution. This co-solvent
system dissolves hydrophobic compounds well, and itself produces low toxicity upon systemic
administration. Naturally, the proportions of a co-solvent system may be varied considerably 
without destroying its solubility and toxicity characteristics. Furthermore, the identity of the 
co-solvent components may be varied: for example, other low-toxicity nonpolar surfactants may 
be used instead of polysorbate 80; the fraction size of polyethylene glycol may be varied; other
biocompatible polymers may replace polyethylene glycol, e.g., polyvinyl pyrrolidone; and other
sugars or polysaccharides may substitute for dextrose. Alternatively, other delivery systems for
hydrophobic pharmaceutical compounds may be employed. Liposomes and emulsions are well 
known examples of delivery vehicles or carriers for hydrophobic drugs. Certain organic solvents 
such as dimethylsulfoxide also may be employed, although usually at the cost of greater toxicity. 
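
The 1:1 dilution of VPD with 5% dextrose in water described above can be worked through arithmetically. The following is a minimal sketch only: the function and variable names are hypothetical, and it assumes that the two volumes are additive on mixing, which is an approximation for ethanol/water mixtures.

```python
# Illustrative arithmetic only: nominal component concentrations of VPD:5W,
# i.e. VPD diluted 1:1 with 5% dextrose in water. All names are hypothetical.

# Stated composition of VPD (made up to volume in absolute ethanol), in % w/v.
VPD_PERCENT_WV = {
    "benzyl alcohol": 3.0,
    "polysorbate 80": 8.0,
    "polyethylene glycol 300": 65.0,
}

# 5% dextrose in water, in % w/v.
D5W_PERCENT_WV = {"dextrose": 5.0}


def dilute_one_to_one(stock_a, stock_b):
    """Return the % w/v of each component after mixing equal volumes.

    Assumes additive volumes, so every component is simply halved.
    """
    mixed = {}
    for stock in (stock_a, stock_b):
        for name, percent in stock.items():
            mixed[name] = mixed.get(name, 0.0) + percent / 2.0
    return mixed


vpd_5w = dilute_one_to_one(VPD_PERCENT_WV, D5W_PERCENT_WV)
for name, percent in vpd_5w.items():
    print(f"{name}: {percent:g}% w/v")
```

Under the additive-volume assumption, each VPD component and the dextrose are present in VPD:5W at half their stock concentration.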

Additionally, the compounds may be delivered using a sustained-release system, such as
semipermeable matrices of solid hydrophobic polymers containing the therapeutic agent. 
Various types of sustained-release materials have been established and are well known by those 
skilled in the art. Sustained-release capsules may, depending on their chemical nature, release the 
compounds for a few weeks up to over 100 days. Depending on the chemical nature and the
biological stability of the therapeutic reagent, additional strategies for protein or other active
ingredient stabilization may be employed. 

The pharmaceutical compositions also may comprise suitable solid or gel phase carriers 
or excipients. Examples of such carriers or excipients include but are not limited to calcium 
carbonate, calcium phosphate, various sugars, starches, cellulose derivatives, gelatin, and
polymers such as polyethylene glycols. Many of the active ingredients of the invention may be
provided as salts with pharmaceutically compatible counter ions. Such pharmaceutically 
acceptable base addition salts are those salts which retain the biological effectiveness and 
properties of the free acids and which are obtained by reaction with inorganic or organic bases 
such as sodium hydroxide, magnesium hydroxide, ammonia, trialkylamine, dialkylamine,
monoalkylamine, dibasic amino acids, sodium acetate, potassium benzoate, triethanol amine and
the like. 

The pharmaceutical composition of the invention may be in the form of a complex of the 
protein(s) or other active ingredient(s) of the present invention along with protein or peptide
antigens. The protein and/or peptide antigen will deliver a stimulatory signal to both B and T
lymphocytes. B lymphocytes will respond to antigen through their surface immunoglobulin
receptor. T lymphocytes will respond to antigen through the T cell receptor (TCR) following 
presentation of the antigen by MHC proteins. MHC and structurally related proteins including 
those encoded by class I and class II MHC genes on host cells will serve to present the peptide
antigen(s) to T lymphocytes. The antigen components could also be supplied as purified
MHC-peptide complexes alone or with co-stimulatory molecules that can directly signal T cells.
Alternatively antibodies able to bind surface immunoglobulin and other molecules on B cells as 
well as antibodies able to bind the TCR and other molecules on T cells can be combined with the 
pharmaceutical composition of the invention. 

The pharmaceutical composition of the invention may be in the form of a liposome in
which protein of the present invention is combined, in addition to other pharmaceutically
acceptable carriers, with amphipathic agents such as lipids which exist in aggregated form as
micelles, insoluble monolayers, liquid crystals, or lamellar layers in aqueous solution. Suitable
lipids for liposomal formulation include, without limitation, monoglycerides, diglycerides,
sulfatides, lysolecithins, phospholipids, saponin, bile acids, and the like. Preparation of such
liposomal formulations is within the level of skill in the art, as disclosed, for example, in U.S.
Patent Nos. 4,235,871; 4,501,728; 4,837,028; and 4,737,323, all of which are incorporated
herein by reference. 

The amount of protein or other active ingredient of the present invention in the 
pharmaceutical composition of the present invention will depend upon the nature and severity of
the condition being treated, and on the nature of prior treatments which the patient has
undergone. Ultimately, the attending physician will decide the amount of protein or other active
ingredient of the present invention with which to treat each individual patient. Initially, the
attending physician will administer low doses of protein or other active ingredient of the present
invention and observe the patient's response. Larger doses of protein or other active ingredient
of the present invention may be administered until the optimal therapeutic effect is obtained for
the patient, and at that point the dosage is not increased further. It is contemplated that the
various pharmaceutical compositions used to practice the method of the present invention should
contain about 0.01 µg to about 100 mg (preferably about 0.1 µg to about 10 mg, more preferably
about 0.1 µg to about 1 mg) of protein or other active ingredient of the present invention per kg
body weight. For compositions of the present invention which are useful for bone, cartilage,
tendon or ligament regeneration, the therapeutic method includes administering the composition
topically, systemically, or locally as an implant or device. When administered, the therapeutic
composition for use in this invention is, of course, in a pyrogen-free, physiologically acceptable
form. Further, the composition may desirably be encapsulated or injected in a viscous form for
delivery to the site of bone, cartilage or tissue damage. Topical administration may be suitable
for wound healing and tissue repair. Therapeutically useful agents other than a protein or other 
active ingredient of the invention which may also optionally be included in the composition as 
described above, may alternatively or additionally, be administered simultaneously or 
sequentially with the composition in the methods of the invention. Preferably for bone and/or
cartilage formation, the composition would include a matrix capable of delivering the
protein-containing or other active ingredient-containing composition to the site of bone and/or
cartilage damage, providing a structure for the developing bone and cartilage and optimally 
capable of being resorbed into the body. Such matrices may be formed of materials presently in 
use for other implanted medical applications.

The choice of matrix material is based on biocompatibility, biodegradability, mechanical
properties, cosmetic appearance and interface properties. The particular application of the
compositions will define the appropriate formulation. Potential matrices for the compositions 
may be biodegradable and chemically defined calcium sulfate, tricalcium phosphate,
hydroxyapatite, polylactic acid, polyglycolic acid and polyanhydrides. Other potential materials
are biodegradable and biologically well-defined, such as bone or dermal collagen. Further 
matrices are comprised of pure proteins or extracellular matrix components. Other potential 
matrices are nonbiodegradable and chemically defined, such as sintered hydroxyapatite, bioglass, 
aluminates, or other ceramics. Matrices may be comprised of combinations of any of the above
mentioned types of material, such as polylactic acid and hydroxyapatite or collagen and
tricalcium phosphate. The bioceramics may be altered in composition, such as in
calcium-aluminate-phosphate and processing to alter pore size, particle size, particle shape, and
biodegradability. Presently preferred is a 50:50 (mole weight) copolymer of lactic acid and 
glycolic acid in the form of porous particles having diameters ranging from 150 to 800 microns.
In some applications, it will be useful to utilize a sequestering agent, such as carboxymethyl
cellulose or autologous blood clot, to prevent the protein compositions from disassociating from
the matrix. 

A preferred family of sequestering agents is cellulosic materials such as alkylcelluloses 
(including hydroxyalkylcelluloses), including methylcellulose, ethylcellulose,
hydroxyethylcellulose, hydroxypropylcellulose, hydroxypropyl-methylcellulose, and
carboxymethylcellulose, the most preferred being cationic salts of carboxymethylcellulose
(CMC). Other preferred sequestering agents include hyaluronic acid, sodium alginate,
poly(ethylene glycol), polyoxyethylene oxide, carboxyvinyl polymer and poly(vinyl alcohol).
The amount of sequestering agent useful herein is 0.5-20 wt %, preferably 1-10 wt % based on
total formulation weight, which represents the amount necessary to prevent desorption of the
protein from the polymer matrix and to provide appropriate handling of the composition, yet not 
so much that the progenitor cells are prevented from infiltrating the matrix, thereby providing the 
protein the opportunity to assist the osteogenic activity of the progenitor cells. In further 
compositions, proteins or other active ingredients of the invention may be combined with other
agents beneficial to the treatment of the bone and/or cartilage defect, wound, or tissue in
question. These agents include various growth factors such as epidermal growth factor (EGF),
platelet derived growth factor (PDGF), transforming growth factors (TGF-α and TGF-β), and
insulin-like growth factor (IGF). 

The therapeutic compositions are also presently valuable for veterinary applications.
Particularly domestic animals and thoroughbred horses, in addition to humans, are desired
patients for such treatment with proteins or other active ingredients of the present invention. The
dosage regimen of a protein-containing pharmaceutical composition to be used in tissue 
regeneration will be determined by the attending physician considering various factors which 
modify the action of the proteins, e.g., amount of tissue weight desired to be formed, the site of
damage, the condition of the damaged tissue, the size of a wound, type of damaged tissue (e.g.,
bone), the patient's age, sex, and diet, the severity of any infection, time of administration and 
other clinical factors. The dosage may vary with the type of matrix used in the reconstitution 
and with inclusion of other proteins in the pharmaceutical composition. For example, the 
addition of other known growth factors, such as IGF I (insulin like growth factor I), to the final
composition, may also affect the dosage. Progress can be monitored by periodic assessment of
tissue/bone growth and/or repair, for example, X-rays, histomorphometric determinations and 
tetracycline labeling. 

Polynucleotides of the present invention can also be used for gene therapy. Such 
polynucleotides can be introduced either in vivo or ex vivo into cells for expression in a
mammalian subject. Polynucleotides of the invention may also be administered by other known
methods for introduction of nucleic acid into a cell or organism (including, without limitation, in 
the form of viral vectors or naked DNA). Cells may also be cultured ex vivo in the presence of 
proteins of the present invention in order to proliferate or to produce a desired effect on or 
activity in such cells. Treated cells can then be introduced in vivo for therapeutic purposes. 

4.12.3 EFFECTIVE DOSAGE 

Pharmaceutical compositions suitable for use in the present invention include
compositions wherein the active ingredients are contained in an effective amount to achieve their
intended purpose. More specifically, a therapeutically effective amount means an amount
effective to prevent development of or to alleviate the existing symptoms of the subject being
treated. Determination of the effective amount is well within the capability of those skilled in
the art, especially in light of the detailed disclosure provided herein. For any compound used in
the method of the invention, the therapeutically effective dose can be estimated initially from
appropriate in vitro assays. For example, a dose can be formulated in animal models to achieve a
circulating concentration range that can be used to more accurately determine useful doses in
humans. For example, a dose can be formulated in animal models to achieve a circulating
concentration range that includes the IC50 as determined in cell culture (i.e., the concentration of
the test compound which achieves a half-maximal inhibition of the protein's biological activity).
Such information can be used to more accurately determine useful doses in humans.

A therapeutically effective dose refers to that amount of the compound that results in
amelioration of symptoms or a prolongation of survival in a patient. Toxicity and therapeutic
efficacy of such compounds can be determined by standard pharmaceutical procedures in cell
cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50% of the
population) and the ED50 (the dose therapeutically effective in 50% of the population). The dose
ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the
ratio between LD50 and ED50. Compounds which exhibit high therapeutic indices are preferred.
The data obtained from these cell culture assays and animal studies can be used in formulating a
range of dosage for use in humans. The dosage of such compounds lies preferably within a range
of circulating concentrations that include the ED50 with little or no toxicity. The dosage may
vary within this range depending upon the dosage form employed and the route of administration
utilized. The exact formulation, route of administration and dosage can be chosen by the
individual physician in view of the patient's condition. See, e.g., Fingl et al., 1975, in "The
Pharmacological Basis of Therapeutics", Ch. 1, p. 1. Dosage amount and interval may be adjusted
individually to provide plasma levels of the active moiety which are sufficient to maintain the
desired effects, or minimal effective concentration (MEC). The MEC will vary for each 
compound but can be estimated from in vitro data. Dosages necessary to achieve the MEC will 
depend on individual characteristics and route of administration. However, HPLC assays or 
bioassays can be used to determine plasma concentrations. 
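
The therapeutic index defined above as the ratio between LD50 and ED50 can be illustrated with a short calculation. This is a sketch only: the function name and the numeric inputs are hypothetical placeholders and do not represent measured values for any compound of the invention.

```python
def therapeutic_index(ld50, ed50):
    """Return the therapeutic index, expressed as LD50 divided by ED50.

    Both doses must be positive and given in the same units (e.g. mg/kg).
    """
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical values, chosen only to show the arithmetic.
example_ld50_mg_per_kg = 500.0
example_ed50_mg_per_kg = 5.0
example_index = therapeutic_index(example_ld50_mg_per_kg, example_ed50_mg_per_kg)
print(example_index)
```

A larger ratio indicates a wider margin between the therapeutically effective dose and the lethal dose, which is why compounds with high therapeutic indices are preferred.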

Dosage intervals can also be determined using the MEC value. Compounds should be
administered using a regimen which maintains plasma levels above the MEC for 10-90% of the
time, preferably between 30-90% and most preferably between 50-90%. In cases of local 
administration or selective uptake, the effective local concentration of the drug may not be 
related to plasma concentration. 
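
One way to check a regimen against the stated 10-90%, 30-90% and 50-90% windows is to estimate the fraction of time that measured plasma levels exceed the MEC. The sketch below assumes equally spaced plasma samples over one dosing interval; the function names, sample values and MEC are hypothetical and are used only to illustrate the comparison.

```python
def fraction_time_above_mec(concentrations, mec):
    """Estimate the fraction of time plasma levels exceed the MEC.

    Assumes the concentrations were sampled at equal time intervals, so the
    fraction of samples above the MEC approximates the fraction of time.
    """
    if not concentrations:
        raise ValueError("at least one concentration sample is required")
    above = sum(1 for level in concentrations if level > mec)
    return above / len(concentrations)


def regimen_window(fraction):
    """Classify a fraction against the windows stated in the text."""
    if 0.5 <= fraction <= 0.9:
        return "most preferred (50-90%)"
    if 0.3 <= fraction <= 0.9:
        return "preferred (30-90%)"
    if 0.1 <= fraction <= 0.9:
        return "within 10-90%"
    return "outside 10-90%"


# Hypothetical plasma levels (arbitrary units) over one dosing interval.
example_levels = [0.5, 2.0, 4.0, 3.0, 2.5, 1.8, 1.2, 0.8, 0.6, 0.4]
example_mec = 1.0
example_fraction = fraction_time_above_mec(example_levels, example_mec)
print(example_fraction, regimen_window(example_fraction))
```

In practice the plasma levels would come from HPLC assays or bioassays, and unevenly spaced samples would need to be weighted by the time each sample represents.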

An exemplary dosage regimen for polypeptides or other compositions of the invention
will be in the range of about 0.01 µg/kg to 100 mg/kg of body weight daily, with the preferred
dose being about 0.1 µg/kg to 25 mg/kg of patient body weight daily, varying in adults and
children. Dosing may be once daily, or equivalent doses may be delivered at longer or shorter 
intervals. 
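
The per-kilogram ranges above can be converted to a total daily amount for a given body weight. The following sketch shows the unit conversion only; the names and the 70 kg body weight are hypothetical, and the calculation is not a substitute for the judgment of the prescribing physician.

```python
MICROGRAMS_PER_MILLIGRAM = 1000.0

# Ranges stated in the text, expressed in micrograms per kg per day.
EXEMPLARY_RANGE_UG_PER_KG = (0.01, 100 * MICROGRAMS_PER_MILLIGRAM)
PREFERRED_RANGE_UG_PER_KG = (0.1, 25 * MICROGRAMS_PER_MILLIGRAM)


def daily_dose_range_mg(body_weight_kg, range_ug_per_kg):
    """Return the (low, high) total daily dose in mg for a body weight."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low_ug_per_kg, high_ug_per_kg = range_ug_per_kg
    low_mg = low_ug_per_kg * body_weight_kg / MICROGRAMS_PER_MILLIGRAM
    high_mg = high_ug_per_kg * body_weight_kg / MICROGRAMS_PER_MILLIGRAM
    return low_mg, high_mg


# Hypothetical 70 kg adult, used only to show the conversion.
exemplary_mg = daily_dose_range_mg(70.0, EXEMPLARY_RANGE_UG_PER_KG)
preferred_mg = daily_dose_range_mg(70.0, PREFERRED_RANGE_UG_PER_KG)
print("exemplary range (mg/day):", exemplary_mg)
print("preferred range (mg/day):", preferred_mg)
```

Keeping both bounds in micrograms per kilogram before converting avoids mixing µg and mg, which differ by a factor of 1000.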

The amount of composition administered will, of course, be dependent on the subject
being treated, on the subject's age and weight, the severity of the affliction, the manner of
administration and the judgment of the prescribing physician. 

4.12.4 PACKAGING

The compositions may, if desired, be presented in a pack or dispenser device which may
contain one or more unit dosage forms containing the active ingredient. The pack may, for 
example, comprise metal or plastic foil, such as a blister pack. The pack or dispenser device may 
be accompanied by instructions for administration. Compositions comprising a compound of the 
invention formulated in a compatible pharmaceutical carrier may also be prepared, placed in an 
appropriate container, and labeled for treatment of an indicated condition. 

4.13 ANTIBODIES 

Also included in the invention are antibodies to proteins, or fragments of proteins, of the 
invention. The term "antibody" as used herein refers to immunoglobulin molecules and 
immunologically active portions of immunoglobulin (Ig) molecules, i.e., molecules that contain 
an antigen binding site that specifically binds (immunoreacts with) an antigen. Such antibodies 
include, but are not limited to, polyclonal, monoclonal, chimeric, single chain, Fab, Fab' and F(ab')2 
fragments, and an Fab expression library. In general, an antibody molecule obtained from 
humans relates to any of the classes IgG, IgM, IgA, IgE and IgD, which differ from one another 
by the nature of the heavy chain present in the molecule. Certain classes have subclasses as well, 
such as IgG1, IgG2, and others. Furthermore, in humans, the light chain may be a kappa chain or 
a lambda chain. Reference herein to antibodies includes a reference to all such classes, 
subclasses and types of human antibody species. 

An isolated related protein of the invention may be intended to serve as an antigen, or a 
portion or fragment thereof, and additionally can be used as an immunogen to generate 
antibodies that immunospecifically bind the antigen, using standard techniques for polyclonal 
and monoclonal antibody preparation. The full-length protein can be used or, alternatively, the 
invention provides antigenic peptide fragments of the antigen for use as immunogens. An 
antigenic peptide fragment comprises at least 6 amino acid residues of the amino acid sequence 
of the full-length protein (for example, the amino acid sequence shown in SEQ ID NO: 1351), 
and encompasses an epitope thereof such that an antibody raised against the peptide forms a 
specific immune complex with the full length protein or with any fragment that contains the 
epitope. Preferably, the antigenic peptide comprises at least 10 amino acid residues, or at least 
15 amino acid residues, or at least 20 amino acid residues, or at least 30 amino acid residues. 
Preferred epitopes encompassed by the antigenic peptide are regions of the protein that are 
located on its surface; commonly these are hydrophilic regions. 

In certain embodiments of the invention, at least one epitope encompassed by the 
antigenic peptide is a region of a related protein that is located on the surface of the protein, e.g., a 
hydrophilic region. A hydrophobicity analysis of the human related protein sequence will 
indicate which regions of a related protein are particularly hydrophilic and, therefore, are likely 
to encode surface residues useful for targeting antibody production. As a means for targeting 
antibody production, hydropathy plots showing regions of hydrophilicity and hydrophobicity 
may be generated by any method well known in the art, including, for example, the Kyte 
Doolittle or the Hopp Woods methods, either with or without Fourier transformation. See, e.g., 
Hopp and Woods, 1981, Proc. Nat. Acad. Sci. USA 78: 3824-3828; Kyte and Doolittle, 1982, J. 
Mol. Biol. 157: 105-142, each of which is incorporated herein by reference in its entirety. 
Antibodies that are specific for one or more domains within an antigenic protein, or derivatives, 
fragments, analogs or homologs thereof, are also provided herein. 
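
By way of a non-limiting sketch, a hydropathy profile of the kind referred to above may be computed with a sliding window over the published Kyte and Doolittle hydropathy scale, without Fourier transformation. The window length of 7, the function names and the example sequence below are assumptions chosen for illustration; the sequence is hypothetical and is not a sequence of the invention.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic, negative = hydrophilic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=7):
    """Mean hydropathy of every window of `window` consecutive residues."""
    seq = sequence.upper()
    unknown = set(seq) - set(KYTE_DOOLITTLE)
    if unknown:
        raise ValueError(f"unsupported residue codes: {sorted(unknown)}")
    if not 1 <= window <= len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[aa] for aa in seq]
    return [sum(scores[i:i + window]) / window
            for i in range(len(seq) - window + 1)]


def most_hydrophilic_window(sequence, window=7):
    """Return (start index, peptide, mean score) of the lowest-scoring window."""
    profile = hydropathy_profile(sequence, window)
    start = min(range(len(profile)), key=profile.__getitem__)
    return start, sequence[start:start + window], profile[start]


if __name__ == "__main__":
    # Hypothetical sequence used only to exercise the functions.
    example = "MALLVIVAGSDEEKRRKDSGLLVVAILF"
    start, peptide, score = most_hydrophilic_window(example, window=7)
    print(f"candidate surface region at position {start + 1}: {peptide} ({score:.2f})")
```

A low (strongly negative) window score suggests a hydrophilic stretch that is a candidate surface region; it does not by itself establish that the stretch is an epitope.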
A protein of the invention, or a derivative, fragment, analog, homolog or ortholog 
thereof, may be utilized as an immunogen in the generation of antibodies that 
immunospecifically bind these protein components. 

Various procedures known within the art may be used for the production of polyclonal or 
monoclonal antibodies directed against a protein of the invention, or against derivatives, 
fragments, analogs, homologs or orthologs thereof (see, for example, Antibodies: A Laboratory 
Manual, Harlow E, and Lane D, 1988, Cold Spring Harbor Laboratory Press, Cold Spring 
Harbor, NY, incorporated herein by reference). Some of these antibodies are discussed below. 

5.13.1 Polyclonal Antibodies 

For the production of polyclonal antibodies, various suitable host animals (e.g., rabbit, 
goat, mouse or other mammal) may be immunized by one or more injections with the native 
protein, a synthetic variant thereof, or a derivative of the foregoing. An appropriate 
immunogenic preparation can contain, for example, the naturally occurring immunogenic 
protein, a chemically synthesized polypeptide representing the immunogenic protein, or a 
recombinantly expressed immunogenic protein. Furthermore, the protein may be conjugated to 
a second protein known to be immunogenic in the mammal being immunized. Examples of such 
immunogenic proteins include but are not limited to keyhole limpet hemocyanin, serum albumin, 
bovine thyroglobulin, and soybean trypsin inhibitor. The preparation can further include an 
adjuvant. Various adjuvants used to increase the immunological response include, but are not 
limited to, Freund's (complete and incomplete), mineral gels (e.g., aluminum hydroxide), surface 
active substances (e.g., lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, 
dinitrophenol, etc.), adjuvants usable in humans such as Bacille Calmette-Guerin and 
Corynebacterium parvum, or similar immunostimulatory agents. Additional examples of 
adjuvants which can be employed include MPL-TDM adjuvant (monophosphoryl Lipid A, 
synthetic trehalose dicorynomycolate). 

The polyclonal antibody molecules directed against the immunogenic protein can be 
isolated from the mammal (e.g., from the blood) and further purified by well known techniques, 
such as affinity chromatography using protein A or protein G, which provide primarily the IgG 
fraction of immune serum. Subsequently, or alternatively, the specific antigen which is the 
target of the immunoglobulin sought, or an epitope thereof, may be immobilized on a column to 
purify the immune specific antibody by immunoaffinity chromatography. Purification of 
immunoglobulins is discussed, for example, by D. Wilkinson (The Scientist, published by The 
Scientist, Inc., Philadelphia PA, Vol. 14, No. 8 (April 17, 2000), pp. 25-28). 

5.13.2 Monoclonal Antibodies 

The term "monoclonal antibody" (MAb) or "monoclonal antibody composition", as used 
herein, refers to a population of antibody molecules that contain only one molecular species of 
antibody molecule consisting of a unique light chain gene product and a unique heavy chain 
gene product. In particular, the complementarity determining regions (CDRs) of the monoclonal 
antibody are identical in all the molecules of the population. MAbs thus contain an antigen 
binding site capable of immunoreacting with a particular epitope of the antigen characterized by 
a unique binding affinity for it. 

Monoclonal antibodies can be prepared using hybridoma methods, such as those 
described by Kohler and Milstein, Nature, 256:495 (1975). In a hybridoma method, a mouse, 
hamster, or other appropriate host animal, is typically immunized with an immunizing agent to 
elicit lymphocytes that produce or are capable of producing antibodies that will specifically bind 
to the immunizing agent. Alternatively, the lymphocytes can be immunized in vitro. 

The immunizing agent will typically include the protein antigen, a fragment thereof or a 
fusion protein thereof. Generally, either peripheral blood lymphocytes are used if cells of human 
origin are desired, or spleen cells or lymph node cells are used if non-human mammalian sources 
are desired. The lymphocytes are then fused with an immortalized cell line using a suitable 
fusing agent, such as polyethylene glycol, to form a hybridoma cell (Goding, Monoclonal 
Antibodies: Principles and Practice. Academic Press, (1986) pp. 59-103). Immortalized cell 
lines are usually transformed mammalian cells, particularly myeloma cells of rodent, bovine and 
human origin. Usually, rat or mouse myeloma cell lines are employed. The hybridoma cells can 
be cultured in a suitable culture medium that preferably contains one or more substances that 
inhibit the growth or survival of the unfused, immortalized cells. For example, if the parental 
cells lack the enzyme hypoxanthine guanine phosphoribosyl transferase (HGPRT or HPRT), the 
culture medium for the hybridomas typically will include hypoxanthine, aminopterin, and 
thymidine ("HAT medium"), which substances prevent the growth of HGPRT-deficient cells. 

Preferred immortalized cell lines are those that fuse efficiently, support stable high level 
expression of antibody by the selected antibody-producing cells, and are sensitive to a medium 
such as HAT medium. More preferred immortalized cell lines are murine myeloma lines, which 
can be obtained, for instance, from the Salk Institute Cell Distribution Center, San Diego, 
California and the American Type Culture Collection, Manassas, Virginia. Human myeloma and 
mouse-human heteromyeloma cell lines also have been described for the production of human 
monoclonal antibodies (Kozbor, J. Immunol., 133:3001 (1984); Brodeur et al., Monoclonal 
Antibody Production Techniques and Applications, Marcel Dekker, Inc., New York, (1987) pp. 
51-63). 

The culture medium in which the hybridoma cells are cultured can then be assayed for 
the presence of monoclonal antibodies directed against the antigen. Preferably, the binding 
specificity of monoclonal antibodies produced by the hybridoma cells is determined by 
immunoprecipitation or by an in vitro binding assay, such as radioimmunoassay (RIA) or 
enzyme-linked immunosorbent assay (ELISA). Such techniques and assays are known in the 
art. The binding affinity of the monoclonal antibody can, for example, be determined by the 
Scatchard analysis of Munson and Pollard, Anal. Biochem., 107:220 (1980). Preferably, 
antibodies having a high degree of specificity and a high binding affinity for the target antigen 
are isolated. 
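
For illustration, a classical linear Scatchard treatment plots bound/free ligand against bound ligand, so that B/F = (Bmax - B)/Kd; the slope is -1/Kd and the intercept on the bound axis is Bmax. The Python sketch below fits that line by ordinary least squares. It is a simplified stand-in and is not asserted to be the procedure of the cited reference; the function name and the data values are hypothetical.

```python
def scatchard_fit(bound, free):
    """Estimate (Kd, Bmax) from paired bound and free ligand concentrations.

    Fits B/F = -(1/Kd) * B + Bmax/Kd by ordinary least squares.
    """
    if len(bound) != len(free) or len(bound) < 2:
        raise ValueError("need at least two matched measurements")
    x = list(bound)
    y = [b / f for b, f in zip(bound, free)]      # bound/free ratio
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    kd = -1.0 / slope                              # slope = -1/Kd
    bmax = intercept * kd                          # intercept = Bmax/Kd
    return kd, bmax


if __name__ == "__main__":
    # Synthetic, noise-free data generated from Kd = 2.0 and Bmax = 10.0
    # (arbitrary concentration units) using B = Bmax * F / (Kd + F).
    free = [0.5, 1.0, 2.0, 4.0, 8.0]
    bound = [10.0 * f / (2.0 + f) for f in free]
    kd, bmax = scatchard_fit(bound, free)
    print(f"Kd = {kd:.3f}, Bmax = {bmax:.3f}")
```

A smaller fitted Kd corresponds to a higher binding affinity. With real, noisy binding data a nonlinear fit of the binding isotherm is generally more robust than the linearized plot.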

After the desired hybridoma cells are identified, the clones can be subcloned by limiting 
dilution procedures and grown by standard methods. Suitable culture media for this purpose 
include, for example, Dulbecco's Modified Eagle's Medium and RPMI-1640 medium. 
Alternatively, the hybridoma cells can be grown in vivo as ascites in a mammal. 

The monoclonal antibodies secreted by the subclones can be isolated or purified from the 
culture medium or ascites fluid by conventional immunoglobulin purification procedures such 
as, for example, protein A-Sepharose, hydroxylapatite chromatography, gel electrophoresis, 
dialysis, or affinity chromatography. 

The monoclonal antibodies can also be made by recombinant DNA methods, such as 
those described in U.S. Patent No. 4,816,567. DNA encoding the monoclonal antibodies of the 
invention can be readily isolated and sequenced using conventional procedures (e.g., by using 
oligonucleotide probes that are capable of binding specifically to genes encoding the heavy and 
light chains of murine antibodies). The hybridoma cells of the invention serve as a preferred 
source of such DNA. Once isolated, the DNA can be placed into expression vectors, which are 
then transfected into host cells such as simian COS cells, Chinese hamster ovary (CHO) cells, or 
myeloma cells that do not otherwise produce immunoglobulin protein, to obtain the synthesis of 
monoclonal antibodies in the recombinant host cells. The DNA also can be modified, for 
example, by substituting the coding sequence for human heavy and light chain constant domains 
in place of the homologous murine sequences (U.S. Patent No. 4,816,567; Morrison, Nature 368, 
812-13 (1994)) or by covalently joining to the immunoglobulin coding sequence all or part of the 
coding sequence for a non-immunoglobulin polypeptide. Such a non-immunoglobulin 
polypeptide can be substituted for the constant domains of an antibody of the invention, or can 
be substituted for the variable domains of one antigen-combining site of an antibody of the 
invention to create a chimeric bivalent antibody. 

5.13.2 Humanized Antibodies 

The antibodies directed against the protein antigens of the invention can further comprise 
humanized antibodies or human antibodies. These antibodies are suitable for administration to 
humans without engendering an immune response by the human against the administered 
immunoglobulin. Humanized forms of antibodies are chimeric immunoglobulins, 
immunoglobulin chains or fragments thereof (such as Fv, Fab, Fab', F(ab')2 or other 
antigen-binding subsequences of antibodies) that are principally comprised of the sequence of a human 
immunoglobulin, and contain minimal sequence derived from a non-human immunoglobulin. 
Humanization can be performed following the method of Winter and co-workers (Jones et al., 
Nature, 321:522-525 (1986); Riechmann et al., Nature, 332:323-327 (1988); Verhoeyen et al., 
Science, 239:1534-1536 (1988)), by substituting rodent CDRs or CDR sequences for the 
corresponding sequences of a human antibody. (See also U.S. Patent No. 5,225,539.) In some 
instances, Fv framework residues of the human immunoglobulin are replaced by corresponding 
non-human residues. Humanized antibodies can also comprise residues which are found neither 
in the recipient antibody nor in the imported CDR or framework sequences. In general, the 
humanized antibody will comprise substantially all of at least one, and typically two, variable 
domains, in which all or substantially all of the CDR regions correspond to those of a non-human 
immunoglobulin and all or substantially all of the framework regions are those of a human 
immunoglobulin consensus sequence. The humanized antibody optimally also will comprise at 
least a portion of an immunoglobulin constant region (Fc), typically that of a human 
immunoglobulin (Jones et al., 1986; Riechmann et al., 1988; and Presta, Curr. Op. Struct. Biol., 
2:593-596 (1992)). 



5.13.3 Human Antibodies 

Fully human antibodies relate to antibody molecules in which essentially the entire 
sequences of both the light chain and the heavy chain, including the CDRs, arise from human 
genes. Such antibodies are termed "human antibodies", or "fully human antibodies" herein. 
Human monoclonal antibodies can be prepared by the trioma technique; the human B-cell 
hybridoma technique (see Kozbor, et al., 1983 Immunol Today 4: 72) and the EBV hybridoma 
technique to produce human monoclonal antibodies (see Cole, et al., 1985 In: Monoclonal 
Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). Human monoclonal 
antibodies may be utilized in the practice of the present invention and may be produced by using 
human hybridomas (see Cote, et al., 1983. Proc Natl Acad Sci USA 80: 2026-2030) or by 
transforming human B-cells with Epstein Barr Virus in vitro (see Cole, et al., 1985 In: 
Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). 

In addition, human antibodies can also be produced using additional techniques, 
including phage display libraries (Hoogenboom and Winter, J. Mol. Biol., 227:381 (1991); 
Marks et al., J. Mol. Biol., 222:581 (1991)). Similarly, human antibodies can be made by 
introducing human immunoglobulin loci into transgenic animals, e.g., mice in which the 
endogenous immunoglobulin genes have been partially or completely inactivated. Upon 
challenge, human antibody production is observed, which closely resembles that seen in humans 
in all respects, including gene rearrangement, assembly, and antibody repertoire. This approach 
is described, for example, in U.S. Patent Nos. 5,545,807; 5,545,806; 5,569,825; 5,625,126; 
5,633,425; 5,661,016, and in Marks et al. (Bio/Technology 10, 779-783 (1992)); Lonberg et al. 
(Nature 368, 856-859 (1994)); Morrison (Nature 368, 812-13 (1994)); Fishwild et al. (Nature 
Biotechnology 14, 845-51 (1996)); Neuberger (Nature Biotechnology 14, 826 (1996)); and 
Lonberg and Huszar (Intern. Rev. Immunol. 13, 65-93 (1995)). 

Human antibodies may additionally be produced using transgenic nonhuman animals 
which are modified so as to produce fully human antibodies rather than the animal's endogenous 
antibodies in response to challenge by an antigen. (See PCT publication WO94/02602). The 
endogenous genes encoding the heavy and light immunoglobulin chains in the nonhuman host 
have been incapacitated, and active loci encoding human heavy and light chain immunoglobulins 
are inserted into the host's genome. The human genes are incorporated, for example, using yeast 
artificial chromosomes containing the requisite human DNA segments. An animal which 
provides all the desired modifications is then obtained as progeny by crossbreeding intermediate 
transgenic animals containing fewer than the full complement of the modifications. The 
preferred embodiment of such a nonhuman animal is a mouse, and is termed the Xenomouse™ 
as disclosed in PCT publications WO 96/33735 and WO 96/34096. This animal produces B 
cells which secrete fully human immunoglobulins. The antibodies can be obtained directly from 
the animal after immunization with an immunogen of interest, as, for example, a preparation of a 
polyclonal antibody, or alternatively from immortalized B cells derived from the animal, such as 
hybridomas producing monoclonal antibodies. Additionally, the genes encoding the 
immunoglobulins with human variable regions can be recovered and expressed to obtain the 
antibodies directly, or can be further modified to obtain analogs of antibodies such as, for 
example, single chain Fv molecules. 

An example of a method of producing a nonhuman host, exemplified as a mouse, lacking 
expression of an endogenous immunoglobulin heavy chain is disclosed in U.S. Patent No. 
5,939,598. It can be obtained by a method including deleting the J segment genes from at least 
one endogenous heavy chain locus in an embryonic stem cell to prevent rearrangement of the 
locus and to prevent formation of a transcript of a rearranged immunoglobulin heavy chain locus, 
the deletion being effected by a targeting vector containing a gene encoding a selectable marker; 
and producing from the embryonic stem cell a transgenic mouse whose somatic and germ cells 
contain the gene encoding the selectable marker. 

A method for producing an antibody of interest, such as a human antibody, is disclosed in 
U.S. Patent No. 5,916,771. It includes introducing an expression vector that contains a 
nucleotide sequence encoding a heavy chain into one mammalian host cell in culture, introducing 
an expression vector containing a nucleotide sequence encoding a light chain into another 
mammalian host cell, and fusing the two cells to form a hybrid cell. The hybrid cell expresses an 
antibody containing the heavy chain and the light chain. 

In a further improvement on this procedure, a method for identifying a clinically relevant 
epitope on an immunogen, and a correlative method for selecting an antibody that binds 
immunospecifically to the relevant epitope with high affinity, are disclosed in PCT publication 
WO 99/53049. 

5.13.4 Fab Fragments and Single Chain Antibodies 

According to the invention, techniques can be adapted for the production of single-chain 
antibodies specific to an antigenic protein of the invention (see e.g., U.S. Patent No. 4,946,778). 
In addition, methods can be adapted for the construction of Fab expression libraries (see e.g., 
Huse, et al., 1989 Science 246: 1275-1281) to allow rapid and effective identification of 
monoclonal Fab fragments with the desired specificity for a protein or derivatives, fragments, 
analogs or homologs thereof. Antibody fragments that contain the idiotypes to a protein antigen 
may be produced by techniques known in the art including, but not limited to: (i) an F(ab')2 
fragment produced by pepsin digestion of an antibody molecule; (ii) an Fab fragment generated 
by reducing the disulfide bridges of an F(ab')2 fragment; (iii) an Fab fragment generated by the 
treatment of the antibody molecule with papain and a reducing agent; and (iv) Fv fragments. 

5.13.5 Bispecific Antibodies 

Bispecific antibodies are monoclonal, preferably human or humanized, antibodies that 
have binding specificities for at least two different antigens. In the present case, one of the 
binding specificities is for an antigenic protein of the invention. The second binding target is any 
other antigen, and advantageously is a cell-surface protein or receptor or receptor subunit. 

Methods for making bispecific antibodies are known in the art. Traditionally, the 
recombinant production of bispecific antibodies is based on the co-expression of two 
immunoglobulin heavy-chain/light-chain pairs, where the two heavy chains have different 
specificities (Milstein and Cuello, Nature, 305:537-539 (1983)). Because of the random 
assortment of immunoglobulin heavy and light chains, these hybridomas (quadromas) produce a 
potential mixture of ten different antibody molecules, of which only one has the correct 
bispecific structure. The purification of the correct molecule is usually accomplished by affinity 
chromatography steps. Similar procedures are disclosed in WO 93/08829, published 13 May 
1993, and in Traunecker et al., 1991 EMBO J., 10:3655-3659. 
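
The figure of ten molecules can be reproduced by simple enumeration. The Python sketch below assumes that either heavy chain may pair with either light chain, that any two heavy-light half-molecules may associate, and that the two arms of an assembled antibody are interchangeable; the chain labels are hypothetical.

```python
from itertools import combinations_with_replacement, product

# Two heavy chains and two light chains co-expressed in one quadroma.
HEAVY = ["H1", "H2"]
LIGHT = ["L1", "L2"]

# Each half-molecule (arm) is one heavy chain paired with one light chain.
ARMS = [h + l for h, l in product(HEAVY, LIGHT)]          # 4 possible arms

# An antibody is an unordered pair of arms, so count pairs with repetition.
MOLECULES = list(combinations_with_replacement(ARMS, 2))  # 10 possible molecules


def is_correct_bispecific(molecule):
    """True only when each heavy chain carries its cognate light chain
    and the two arms differ (H1L1 paired with H2L2)."""
    return set(molecule) == {"H1L1", "H2L2"}


if __name__ == "__main__":
    for molecule in MOLECULES:
        tag = "  <- bispecific" if is_correct_bispecific(molecule) else ""
        print(" / ".join(molecule) + tag)
    print(len(MOLECULES), "molecules,",
          sum(map(is_correct_bispecific, MOLECULES)), "with the correct structure")
```

The enumeration counts species only; it says nothing about their relative abundance, which depends on how the chains actually associate in the cell.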

Antibody variable domains with the desired binding specificities (antibody-antigen 
combining sites) can be fused to immunoglobulin constant domain sequences. The fusion 
preferably is with an immunoglobulin heavy-chain constant domain, comprising at least part of 
the hinge, CH2, and CH3 regions. It is preferred to have the first heavy-chain constant region 
(CH1) containing the site necessary for light-chain binding present in at least one of the fusions. 
DNAs encoding the immunoglobulin heavy-chain fusions and, if desired, the immunoglobulin 
light chain, are inserted into separate expression vectors, and are co-transfected into a suitable 
host organism. For further details of generating bispecific antibodies see, for example, Suresh et 
al., Methods in Enzymology, 121:210 (1986). 

According to another approach described in WO 96/27011, the interface between a pair 
of antibody molecules can be engineered to maximize the percentage of heterodimers which are 
recovered from recombinant cell culture. The preferred interface comprises at least a part of the 
CH3 region of an antibody constant domain. In this method, one or more small amino acid side 
chains from the interface of the first antibody molecule are replaced with larger side chains (e.g. 
tyrosine or tryptophan). Compensatory "cavities" of identical or similar size to the large side 
chain(s) are created on the interface of the second antibody molecule by replacing large amino 
acid side chains with smaller ones (e.g. alanine or threonine). This provides a mechanism for 
increasing the yield of the heterodimer over other unwanted end-products such as homodimers. 

Bispecific antibodies can be prepared as full length antibodies or antibody fragments (e.g. 
F(ab')2 bispecific antibodies). Techniques for generating bispecific antibodies from antibody 
fragments have been described in the literature. For example, bispecific antibodies can be 
prepared using chemical linkage. Brennan et al., Science 229:81 (1985) describe a procedure 
wherein intact antibodies are proteolytically cleaved to generate F(ab')2 fragments. These 
fragments are reduced in the presence of the dithiol complexing agent sodium arsenite to 
stabilize vicinal dithiols and prevent intermolecular disulfide formation. The Fab' fragments 
generated are then converted to thionitrobenzoate (TNB) derivatives. One of the Fab'-TNB 
derivatives is then reconverted to the Fab'-thiol by reduction with mercaptoethylamine and is 
mixed with an equimolar amount of the other Fab'-TNB derivative to form the bispecific 
antibody. The bispecific antibodies produced can be used as agents for the selective 
immobilization of enzymes. 

Additionally, Fab' fragments can be directly recovered from E. coli and chemically 
coupled to form bispecific antibodies. Shalaby et al., J. Exp. Med. 175:217-225 (1992) describe 
the production of a fully humanized bispecific antibody F(ab')2 molecule. Each Fab' fragment 
was separately secreted from E. coli and subjected to directed chemical coupling in vitro to form 
the bispecific antibody. The bispecific antibody thus formed was able to bind to cells 
overexpressing the ErbB2 receptor and normal human T cells, as well as trigger the lytic activity 
of human cytotoxic lymphocytes against human breast tumor targets. 

Various techniques for making and isolating bispecific antibody fragments directly from 
recombinant cell culture have also been described. For example, bispecific antibodies have been 
produced using leucine zippers. Kostelny et al., J. Immunol. 148(5):1547-1553 (1992). The 
leucine zipper peptides from the Fos and Jun proteins were linked to the Fab' portions of two 
different antibodies by gene fusion. The antibody homodimers were reduced at the hinge region 
to form monomers and then re-oxidized to form the antibody heterodimers. This method can 
also be utilized for the production of antibody homodimers. The "diabody" technology 
described by Hollinger et al., Proc. Natl. Acad. Sci. USA 90:6444-6448 (1993) has provided an 
alternative mechanism for making bispecific antibody fragments. The fragments comprise a 
heavy-chain variable domain (VH) connected to a light-chain variable domain (VL) by a linker 
which is too short to allow pairing between the two domains on the same chain. Accordingly, 
the VH and VL domains of one fragment are forced to pair with the complementary VL and VH 
domains of another fragment, thereby forming two antigen-binding sites. Another strategy for 
making bispecific antibody fragments by the use of single-chain Fv (sFv) dimers has also been 
reported. See, Gruber et al., J. Immunol. 152:5368 (1994). 

Antibodies with more than two valencies are contemplated. For example, trispecific 
antibodies can be prepared. Tutt et al., J. Immunol. 147:60 (1991). 

Exemplary bispecific antibodies can bind to two different epitopes, at least one of which 
originates in the protein antigen of the invention. Alternatively, an anti-antigenic arm of an 
immunoglobulin molecule can be combined with an arm which binds to a triggering molecule on 
a leukocyte such as a T-cell receptor molecule (e.g. CD2, CD3, CD28, or B7), or Fc receptors for 
IgG (FcγR), such as FcγRI (CD64), FcγRII (CD32) and FcγRIII (CD16) so as to focus 
cellular defense mechanisms to the cell expressing the particular antigen. Bispecific antibodies 
can also be used to direct cytotoxic agents to cells which express a particular antigen. These 
antibodies possess an antigen-binding arm and an arm which binds a cytotoxic agent or a 
radionuclide chelator, such as EOTUBE, DTPA, DOTA, or TETA. Another bispecific antibody 
of interest binds the protein antigen described herein and further binds tissue factor (TF). 

5.13.6 Heteroconjugate Antibodies 

Heteroconjugate antibodies are also within the scope of the present invention. 
Heteroconjugate antibodies are composed of two covalently joined antibodies. Such antibodies 
have, for example, been proposed to target immune system cells to unwanted cells (U.S. Patent 
No. 4,676,980), and for treatment of HIV infection (WO 91/00360; WO 92/200373; EP 03089). 
It is contemplated that the antibodies can be prepared in vitro using known methods in synthetic 
protein chemistry, including those involving crosslinking agents. For example, immunotoxins 
can be constructed using a disulfide exchange reaction or by forming a thioether bond. 
Examples of suitable reagents for this purpose include iminothiolate and 
methyl-4-mercaptobutyrimidate and those disclosed, for example, in U.S. Patent No. 4,676,980. 

5.13.7 Effector Function Engineering 

It can be desirable to modify the antibody of the invention with respect to effector function, so as 
to enhance, e.g., the effectiveness of the antibody in treating cancer. For example, cysteine 
residue(s) can be introduced into the Fc region, thereby allowing interchain disulfide bond 
formation in this region. The homodimeric antibody thus generated can have improved 
internalization capability and/or increased complement-mediated cell killing and 
antibody-dependent cellular cytotoxicity (ADCC). See Caron et al., J. Exp. Med., 176:1191-1195 (1992) 
and Shopes, J. Immunol., 148:2918-2922 (1992). Homodimeric antibodies with enhanced 
anti-tumor activity can also be prepared using heterobifunctional cross-linkers as described in Wolff 
et al., Cancer Research, 53:2560-2565 (1993). Alternatively, an antibody can be engineered that 
has dual Fc regions and can thereby have enhanced complement lysis and ADCC capabilities. 
See Stevenson et al., Anti-Cancer Drug Design, 3:219-230 (1989). 

5.13.8 Immunoconjugates 

The invention also pertains to immunoconjugates comprising an antibody conjugated to a 
cytotoxic agent such as a chemotherapeutic agent, toxin (e.g., an enzymatically active toxin of 
bacterial, fungal, plant, or animal origin, or fragments thereof), or a radioactive isotope (i.e., a 
radioconjugate). 

Chemotherapeutic agents useful in the generation of such immunoconjugates have been 
described above. Enzymatically active toxins and fragments thereof that can be used include 
diphtheria A chain, nonbinding active fragments of diphtheria toxin, exotoxin A chain (from 
Pseudomonas aeruginosa), ricin A chain, abrin A chain, modeccin A chain, alpha-sarcin, 
Aleurites fordii proteins, dianthin proteins, Phytolacca americana proteins (PAPI, PAPII, and 
PAP-S), Momordica charantia inhibitor, curcin, crotin, Saponaria officinalis inhibitor, gelonin, 
mitogellin, restrictocin, phenomycin, enomycin, and the trichothecenes. A variety of 
radionuclides are available for the production of radioconjugated antibodies. Examples include 
212Bi, 131I, 131In, 90Y, and 186Re. 

Conjugates of the antibody and cytotoxic agent are made using a variety of bifunctional 
protein-coupling agents such as N-succinimidyl-3-(2-pyridyldithiol) propionate (SPDP), 
iminothiolane (IT), bifunctional derivatives of imidoesters (such as dimethyl adipimidate HCl), 
active esters (such as disuccinimidyl suberate), aldehydes (such as glutaraldehyde), bis-azido 
compounds (such as bis(p-azidobenzoyl) hexanediamine), bis-diazonium derivatives (such as 
bis-(p-diazoniumbenzoyl)-ethylenediamine), diisocyanates (such as tolylene 2,6-diisocyanate), 
and bis-active fluorine compounds (such as 1,5-difluoro-2,4-dinitrobenzene). For example, a 
ricin immunotoxin can be prepared as described in Vitetta et al., Science, 238:1098 (1987). 
Carbon-14-labeled 1-isothiocyanatobenzyl-3-methyldiethylene triaminepentaacetic acid 
(MX-DTPA) is an exemplary chelating agent for conjugation of radionuclide to the antibody. See 
WO94/11026. 

In another embodiment, the antibody can be conjugated to a "receptor" (such as
streptavidin) for utilization in tumor pretargeting wherein the antibody-receptor conjugate is
administered to the patient, followed by removal of unbound conjugate from the circulation
using a clearing agent and then administration of a "ligand" (e.g., avidin) that is in turn
conjugated to a cytotoxic agent.

4.14 COMPUTER READABLE SEQUENCES 

In one application of this embodiment, a nucleotide sequence of the present invention can 
be recorded on computer readable media. As used herein, "computer readable media" refers to 
any medium which can be read and accessed directly by a computer. Such media include, but 
are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and 
magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM 
and ROM; and hybrids of these categories such as magnetic/optical storage media. A skilled
artisan can readily appreciate how any of the presently known computer readable media can
be used to create a manufacture comprising computer readable medium having recorded thereon
a nucleotide sequence of the present invention. As used herein, "recorded" refers to a process for
storing information on computer readable medium. A skilled artisan can readily adopt any of the
presently known methods for recording information on computer readable medium to generate
manufactures comprising the nucleotide sequence information of the present invention.

A variety of data storage structures are available to a skilled artisan for creating a
computer readable medium having recorded thereon a nucleotide sequence of the present
invention. The choice of the data storage structure will generally be based on the means chosen
to access the stored information. In addition, a variety of data processor programs and formats
can be used to store the nucleotide sequence information of the present invention on computer
readable medium. The sequence information can be represented in a word processing text file,
formatted in commercially-available software such as WordPerfect and Microsoft Word, or
represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase,
Oracle, or the like. A skilled artisan can readily adapt any number of data processor structuring
formats (e.g. text file or database) in order to obtain computer readable medium having recorded
thereon the nucleotide sequence information of the present invention.
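
By way of illustration only, the two storage routes named above (a plain ASCII text file and a
database application) might be sketched as follows. This is a minimal sketch under stated
assumptions: the FASTA-style layout, the function names, the table name and the use of SQLite
(standing in for DB2, Sybase or Oracle) are illustrative choices and are not taken from the
specification.

```python
import sqlite3


def write_fasta(path, records, width=60):
    """Write {identifier: sequence} pairs to a plain ASCII, FASTA-style text file."""
    with open(path, "w", encoding="ascii") as handle:
        for name, seq in records.items():
            handle.write(f">{name}\n")
            for start in range(0, len(seq), width):
                handle.write(seq[start:start + width] + "\n")


def read_fasta(path):
    """Read the text file back into a {identifier: sequence} dictionary."""
    records, name = {}, None
    with open(path, encoding="ascii") as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                name = line[1:]
                records[name] = ""
            elif line and name is not None:
                records[name] += line
    return records


def store_in_database(connection, records):
    """Store the same records in a relational table keyed by identifier.

    The table layout is hypothetical; any database application could be used.
    """
    connection.execute(
        "CREATE TABLE IF NOT EXISTS sequences (seq_id TEXT PRIMARY KEY, residues TEXT)"
    )
    connection.executemany(
        "INSERT OR REPLACE INTO sequences VALUES (?, ?)", list(records.items())
    )
    connection.commit()


def fetch_sequence(connection, seq_id):
    """Return the stored residues for seq_id, or None if it is absent."""
    row = connection.execute(
        "SELECT residues FROM sequences WHERE seq_id = ?", (seq_id,)
    ).fetchone()
    return row[0] if row else None
```

Either structure can be chosen according to how the stored information will later be accessed:
the text file suits sequential reading, while the table suits lookup by identifier.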

By providing any of the nucleotide sequences SEQ ID NO:1-1350 or a representative
fragment thereof, or a nucleotide sequence at least 95% identical to any of the nucleotide
sequences of SEQ ID NO:1-1350 in computer readable form, a skilled artisan can routinely
access the sequence information for a variety of purposes. Computer software is publicly
available which allows a skilled artisan to access sequence information provided in a computer
readable medium. The examples which follow demonstrate how software which implements the
BLAST (Altschul et al., J. Mol. Biol. 215:403-410 (1990)) and BLAZE (Brutlag et al., Comp.
Chem. 17:203-207 (1993)) search algorithms on a Sybase system is used to identify open reading
frames (ORFs) within a nucleic acid sequence. Such ORFs may be protein encoding fragments
and may be useful in producing commercially important proteins such as enzymes used in
fermentation reactions and in the production of commercially useful metabolites.
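
As a simplified illustration of identifying ORFs within a stored nucleic acid sequence, the sketch
below scans all six reading frames for an ATG codon followed in frame by a stop codon. It is not
the BLAST or BLAZE software referred to above; the function names and the `min_codons` threshold
are assumptions made for the example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written with A, C, G, T."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=30):
    """Return (strand, start, end, orf) tuples for ATG...stop ORFs in six frames.

    start and end are 0-based, end-exclusive offsets on the strand that was
    scanned; min_codons counts codons from ATG up to, but not including, the stop.
    """
    seq = seq.upper()
    orfs = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for pos in range(frame, len(strand_seq) - 2, 3):
                codon = strand_seq[pos:pos + 3]
                if start is None and codon == "ATG":
                    start = pos  # first in-frame start codon opens a candidate ORF
                elif start is not None and codon in STOP_CODONS:
                    if (pos - start) // 3 >= min_codons:
                        orfs.append((strand, start, pos + 3, strand_seq[start:pos + 3]))
                    start = None  # resume the search after the stop codon
    return orfs
```

Candidate ORFs found in this way would still need to be compared against known sequences before
any conclusion about protein coding capacity is drawn.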

As used herein, "a computer-based system" refers to the hardware means, software
means, and data storage means used to analyze the nucleotide sequence information of the
present invention. The minimum hardware means of the computer-based systems of the present
invention comprises a central processing unit (CPU), input means, output means, and data
storage means. A skilled artisan can readily appreciate that any one of the currently available
computer-based systems is suitable for use in the present invention. As stated above, the
computer-based systems of the present invention comprise a data storage means having stored
therein a nucleotide sequence of the present invention and the necessary hardware means and
software means for supporting and implementing a search means. As used herein, "data storage
means" refers to memory which can store nucleotide sequence information of the present
invention, or a memory access means which can access manufactures having recorded thereon
the nucleotide sequence information of the present invention.

As used herein, "search means" refers to one or more programs which are implemented
on the computer-based system to compare a target sequence or target structural motif with the
sequence information stored within the data storage means. Search means are used to identify
fragments or regions of a known sequence which match a particular target sequence or target
motif. A variety of known algorithms are disclosed publicly, and a variety of commercially
available software packages for conducting search means are available and can be used in the
computer-based systems of the present invention. Examples of such software include, but are
not limited to, Smith-Waterman, MacPattern (EMBL), BLASTN and BLASTX (NCBI). A
skilled artisan can readily recognize that any one of the available algorithms or implementing
software packages for conducting homology searches can be adapted for use in the present
computer-based systems. As used herein, a "target sequence" can be any nucleic acid or amino
acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can
readily recognize that the longer a target sequence is, the less likely a target sequence will be
present as a random occurrence in the database. The most preferred sequence length of a target
sequence is from about 10 to 300 amino acids, more preferably from about 30 to 100 nucleotide
residues. However, it is well recognized that searches for commercially important fragments,
such as sequence fragments involved in gene expression and protein processing, may be of
shorter length.

As used herein, "a target structural motif," or "target motif," refers to any rationally
selected sequence or combination of sequences in which the sequence(s) are chosen based on a
three-dimensional configuration which is formed upon the folding of the target motif. There are
a variety of target motifs known in the art. Protein target motifs include, but are not limited to,
enzyme active sites and signal sequences. Nucleic acid target motifs include, but are not limited
to, promoter sequences, hairpin structures and inducible expression elements (protein binding
sequences).

4.15 TRIPLE HELIX FORMATION

In addition, the fragments of the present invention, as broadly described, can be used to
control gene expression through triple helix formation or antisense DNA or RNA, both of which
methods are based on the binding of a polynucleotide sequence to DNA or RNA.
Polynucleotides suitable for use in these methods are preferably 20 to 40 bases in length and are
designed to be complementary to a region of the gene involved in transcription (triple helix - see
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan
et al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem.
56:560 (1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press,
Boca Raton, FL (1988)). Triple helix-formation optimally results in a shut-off of RNA
transcription from DNA, while antisense RNA hybridization blocks translation of an mRNA
molecule into polypeptide. Both techniques have been demonstrated to be effective in model
systems. Information contained in the sequences of the present invention is necessary for the
design of an antisense or triple helix oligonucleotide.
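
To illustrate how sequence information feeds the design of an antisense oligonucleotide, the
following minimal sketch returns the DNA oligonucleotide complementary to a chosen window of an
mRNA. The function name, the window-selection arguments and the enforcement of the 20 to 40 base
preference as a hard limit are assumptions made for the example; choosing an effective target
region in practice involves considerations not modelled here.

```python
# Maps each mRNA base to its complementary DNA base.
RNA_TO_DNA_COMPLEMENT = str.maketrans("ACGU", "TGCA")


def antisense_oligo(mrna, start, length=20):
    """Return a DNA oligo, written 5'->3', complementary to mrna[start:start+length].

    The input may be written with U or T; it is normalised to RNA first.
    """
    if not 20 <= length <= 40:
        raise ValueError("the text prefers oligonucleotides of 20 to 40 bases")
    target = mrna.upper().replace("T", "U")[start:start + length]
    if len(target) < length:
        raise ValueError("target window runs past the end of the sequence")
    # Complement each base, then reverse so the oligo reads 5'->3'.
    return target.translate(RNA_TO_DNA_COMPLEMENT)[::-1]
```

A call such as `antisense_oligo(sequence, 0, 20)` targets the first 20 bases of the supplied
sequence; other windows are selected by changing `start` and `length`.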

4.16 DIAGNOSTIC ASSAYS AND KITS 

The present invention further provides methods to identify the presence or expression of
one of the ORFs of the present invention, or homolog thereof, in a test sample, using a nucleic
acid probe or antibodies of the present invention, optionally conjugated or otherwise associated
with a suitable label.

In general, methods for detecting a polynucleotide of the invention can comprise
contacting a sample with a compound that binds to and forms a complex with the polynucleotide
for a period sufficient to form the complex, and detecting the complex, so that if a complex is
detected, a polynucleotide of the invention is detected in the sample. Such methods can also
comprise contacting a sample under stringent hybridization conditions with nucleic acid primers
that anneal to a polynucleotide of the invention under such conditions, and amplifying annealed
polynucleotides, so that if a polynucleotide is amplified, a polynucleotide of the invention is
detected in the sample.

In general, methods for detecting a polypeptide of the invention can comprise contacting
a sample with a compound that binds to and forms a complex with the polypeptide for a period
sufficient to form the complex, and detecting the complex, so that if a complex is detected, a
polypeptide of the invention is detected in the sample.

In detail, such methods comprise incubating a test sample with one or more of the
antibodies or one or more of the nucleic acid probes of the present invention and assaying for
binding of the nucleic acid probes or antibodies to components within the test sample.

Conditions for incubating a nucleic acid probe or antibody with a test sample vary.
Incubation conditions depend on the format employed in the assay, the detection methods
employed, and the type and nature of the nucleic acid probe or antibody used in the assay. One
skilled in the art will recognize that any one of the commonly available hybridization,
amplification or immunological assay formats can readily be adapted to employ the nucleic acid
probes or antibodies of the present invention. Examples of such assays can be found in Chard,
T., An Introduction to Radioimmunoassay and Related Techniques, Elsevier Science Publishers,
Amsterdam, The Netherlands (1986); Bullock, G.R. et al., Techniques in Immunocytochemistry,
Academic Press, Orlando, FL Vol. 1 (1982), Vol. 2 (1983), Vol. 3 (1985); Tijssen, P., Practice
and Theory of Immunoassays: Laboratory Techniques in Biochemistry and Molecular Biology,
Elsevier Science Publishers, Amsterdam, The Netherlands (1985). The test samples of the
present invention include cells, protein or membrane extracts of cells, or biological fluids such as
sputum, blood, serum, plasma, or urine. The test sample used in the above-described method
will vary based on the assay format, nature of the detection method and the tissues, cells or
extracts used as the sample to be assayed. Methods for preparing protein extracts or membrane
extracts of cells are well known in the art and can readily be adapted in order to obtain a
sample which is compatible with the system utilized.



In another embodiment of the present invention, kits are provided which contain the
necessary reagents to carry out the assays of the present invention. Specifically, the invention
provides a compartment kit to receive, in close confinement, one or more containers which
comprises: (a) a first container comprising one of the probes or antibodies of the present
invention; and (b) one or more other containers comprising one or more of the following: wash
reagents, reagents capable of detecting presence of a bound probe or antibody.

In detail, a compartment kit includes any kit in which reagents are contained in separate
containers. Such containers include small glass containers, plastic containers or strips of plastic
or paper. Such containers allow one to efficiently transfer reagents from one compartment to
another compartment such that the samples and reagents are not cross-contaminated, and the
agents or solutions of each container can be added in a quantitative fashion from one
compartment to another. Such containers will include a container which will accept the test
sample, a container which contains the antibodies used in the assay, containers which contain
wash reagents (such as phosphate buffered saline, Tris-buffers, etc.), and containers which
contain the reagents used to detect the bound antibody or probe. Types of detection reagents
include labeled nucleic acid probes, labeled secondary antibodies, or in the alternative, if the
primary antibody is labeled, the enzymatic, or antibody binding reagents which are capable of
reacting with the labeled antibody. One skilled in the art will readily recognize that the disclosed
probes and antibodies of the present invention can be readily incorporated into one of the
established kit formats which are well known in the art.



4.17 MEDICAL IMAGING

The novel polypeptides and binding partners of the invention are useful in medical
imaging of sites expressing the molecules of the invention (e.g., where the polypeptide of the
invention is involved in the immune response, for imaging sites of inflammation or infection).
See, e.g., Kunkel et al., U.S. Pat. No. 5,413,778. Such methods involve chemical attachment of
a labeling or imaging agent, administration of the labeled polypeptide to a subject in a
pharmaceutically acceptable carrier, and imaging the labeled polypeptide in vivo at the target
site.

4.18 SCREENING ASSAYS

Using the isolated proteins and polynucleotides of the invention, the present invention
further provides methods of obtaining and identifying agents which bind to a polypeptide
encoded by an ORF corresponding to any of the nucleotide sequences set forth in SEQ ID
NO:1-1350, or bind to a specific domain of the polypeptide encoded by the nucleic acid. In
detail, said method comprises the steps of:

(a) contacting an agent with an isolated protein encoded by an ORF of the present
invention, or nucleic acid of the invention; and

(b) determining whether the agent binds to said protein or said nucleic acid.

In general, therefore, such methods for identifying compounds that bind to a
polynucleotide of the invention can comprise contacting a compound with a polynucleotide of
the invention for a time sufficient to form a polynucleotide/compound complex, and detecting
the complex, so that if a polynucleotide/compound complex is detected, a compound that binds
to a polynucleotide of the invention is identified.

Likewise, in general, such methods for identifying compounds that bind to a
polypeptide of the invention can comprise contacting a compound with a polypeptide of the
invention for a time sufficient to form a polypeptide/compound complex, and detecting the
complex, so that if a polypeptide/compound complex is detected, a compound that binds to a
polypeptide of the invention is identified.

Methods for identifying compounds that bind to a polypeptide of the invention can also
comprise contacting a compound with a polypeptide of the invention in a cell for a time
sufficient to form a polypeptide/compound complex, wherein the complex drives expression of a
reporter gene sequence in the cell, and detecting the complex by detecting reporter gene
sequence expression so that if a polypeptide/compound complex is detected, a compound that
binds a polypeptide of the invention is identified.

Compounds identified via such methods can include compounds which modulate the
activity of a polypeptide of the invention (that is, increase or decrease its activity, relative to
activity observed in the absence of the compound). Alternatively, compounds identified via such
methods can include compounds which modulate the expression of a polynucleotide of the
invention (that is, increase or decrease expression relative to expression levels observed in the
absence of the compound). Compounds, such as compounds identified via the methods of the
invention, can be tested using standard assays well known to those of skill in the art for their
ability to modulate activity/expression.

The agents screened in the above assay can be, but are not limited to, peptides, 
carbohydrates, vitamin derivatives, or other pharmaceutical agents. The agents can be selected 
and screened at random or rationally selected or designed using protein modeling techniques. 

For random screening, agents such as peptides, carbohydrates, pharmaceutical agents and
the like are selected at random and are assayed for their ability to bind to the protein encoded by
the ORF of the present invention. Alternatively, agents may be rationally selected or designed.
As used herein, an agent is said to be "rationally selected or designed" when the agent is chosen
based on the configuration of the particular protein. For example, one skilled in the art can
readily adapt currently available procedures to generate peptides, pharmaceutical agents and the
like, capable of binding to a specific peptide sequence, in order to generate rationally designed
antipeptide peptides, for example see Hurby et al., "Application of Synthetic Peptides: Antisense
Peptides," in Synthetic Peptides, A User's Guide, W.H. Freeman, NY (1992), pp. 289-307, and
Kaspczak et al., Biochemistry 28:9230-8 (1989), or pharmaceutical agents, or the like.

In addition to the foregoing, one class of agents of the present invention, as broadly
described, can be used to control gene expression through binding to one of the ORFs or EMFs
of the present invention. As described above, such agents can be randomly screened or
rationally designed/selected. Targeting the ORF or EMF allows a skilled artisan to design
sequence specific or element specific agents, modulating the expression of either a single ORF or
multiple ORFs which rely on the same EMF for expression control. One class of DNA binding
agents are agents which contain base residues which hybridize or form a triple helix
by binding to DNA or RNA. Such agents can be based on the classic phosphodiester,
ribonucleic acid backbone, or can be a variety of sulfhydryl or polymeric derivatives which have
base attachment capacity.

Agents suitable for use in these methods preferably contain 20 to 40 bases and are
designed to be complementary to a region of the gene involved in transcription (triple helix - see
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan et
al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J. Neurochem. 56:560
(1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca
Raton, FL (1988)). Triple helix-formation optimally results in a shut-off of RNA transcription
from DNA, while antisense RNA hybridization blocks translation of an mRNA molecule into
polypeptide. Both techniques have been demonstrated to be effective in model systems.
Information contained in the sequences of the present invention is necessary for the design of an
antisense or triple helix oligonucleotide and other DNA binding agents.

Agents which bind to a protein encoded by one of the ORFs of the present invention can
be used as a diagnostic agent. Agents which bind to a protein encoded by one of the ORFs of the
present invention can be formulated using known techniques to generate a pharmaceutical
composition.

4.19 USE OF NUCLEIC ACIDS AS PROBES

Another aspect of the subject invention is to provide for polypeptide-specific nucleic acid
hybridization probes capable of hybridizing with naturally occurring nucleotide sequences. The
hybridization probes of the subject invention may be derived from any of the nucleotide
sequences SEQ ID NO:1-1350. Because the corresponding gene is only expressed in a limited
number of tissues, a hybridization probe derived from any of the nucleotide sequences SEQ
ID NO:1-1350 can be used as an indicator of the presence of RNA of a cell type of such a tissue
in a sample.

Any suitable hybridization technique can be employed, such as, for example, in situ
hybridization. PCR as described in US Patent Nos. 4,683,195 and 4,965,188 provides
additional uses for oligonucleotides based upon the nucleotide sequences. Such probes used in
PCR may be of recombinant origin, may be chemically synthesized, or a mixture of both. The
probe will comprise a discrete nucleotide sequence for the detection of identical sequences or a
degenerate pool of possible sequences for identification of closely related genomic sequences.
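
The distinction between a discrete probe sequence and a degenerate pool can be illustrated with
the sketch below, which expands an oligonucleotide written with standard IUPAC ambiguity codes
into the discrete sequences it represents. The function names and the example oligonucleotide
are hypothetical and are not taken from the sequences of the invention.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(oligo):
    """List every discrete sequence represented by a degenerate oligonucleotide."""
    choices = [IUPAC[base] for base in oligo.upper()]
    return ["".join(combo) for combo in product(*choices)]


def degeneracy(oligo):
    """Size of the pool: the product of the number of choices at each position."""
    count = 1
    for base in oligo.upper():
        count *= len(IUPAC[base])
    return count
```

For instance, the hypothetical oligonucleotide `ATGRAY` has two ambiguous positions with two
choices each, so its pool contains four discrete sequences.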

Other means for producing specific hybridization probes for nucleic acids include the
cloning of nucleic acid sequences into vectors for the production of mRNA probes. Such vectors
are known in the art and are commercially available and may be used to synthesize RNA probes
in vitro by means of the addition of the appropriate RNA polymerase, such as T7 or SP6 RNA
polymerase, and the appropriate radioactively labeled nucleotides. The nucleotide sequences may
be used to construct hybridization probes for mapping their respective genomic sequences. The
nucleotide sequence provided herein may be mapped to a chromosome or specific regions of a
chromosome using well known genetic and/or chromosomal mapping techniques. These
techniques include in situ hybridization, linkage analysis against known chromosomal markers,
hybridization screening with libraries or flow-sorted chromosomal preparations specific to
known chromosomes, and the like. The technique of fluorescent in situ hybridization of
chromosome spreads has been described, among other places, in Verma et al. (1988) Human
Chromosomes: A Manual of Basic Techniques, Pergamon Press, New York NY.

Fluorescent in situ hybridization of chromosomal preparations and other physical 
chromosome mapping techniques may be correlated with additional genetic map data. Examples 
of genetic map data can be found in the 1994 Genome Issue of Science (265:1981f). Correlation 
between the location of a nucleic acid on a physical chromosomal map and a specific disease (or 
predisposition to a specific disease) may help delimit the region of DNA associated with that 
genetic disease. The nucleotide sequences of the subject invention may be used to detect 
differences in gene sequences between normal, carrier or affected individuals. 

4.20 PREPARATION OF SUPPORT BOUND OLIGONUCLEOTIDES

Oligonucleotides, i.e., small nucleic acid segments, may be readily prepared by, for 
example, directly synthesizing the oligonucleotide by chemical means, as is commonly practiced 
using an automated oligonucleotide synthesizer. 

Support bound oligonucleotides may be prepared by any of the methods known to those of
skill in the art using any suitable support such as glass, polystyrene or Teflon. One strategy is to
precisely spot oligonucleotides synthesized by standard synthesizers. Immobilization can be
achieved using passive adsorption (Inouye & Hondo, (1990) J. Clin. Microbiol. 28(6) 1469-72);
using UV light (Nagata et al., 1985; Dahlen et al., 1987; Morrissey & Collins, (1989) Mol. Cell
Probes 3(2) 189-207) or by covalent binding of base modified DNA (Keller et al., 1988; 1989); all
references being specifically incorporated herein.

Another strategy that may be employed is the use of the strong biotin-streptavidin
interaction as a linker. For example, Broude et al. (1994) Proc. Natl. Acad. Sci. USA 91(8) 3072-6,
describe the use of biotinylated probes, although these are duplex probes, that are immobilized on
streptavidin-coated magnetic beads. Streptavidin-coated beads may be purchased from Dynal,
Oslo. Of course, this same linking chemistry is applicable to coating any surface with streptavidin.
Biotinylated probes may be purchased from various sources, such as, e.g., Operon Technologies
(Alameda, CA).

Nunc Laboratories (Naperville, IL) also sells suitable material that could be used. Nunc
Laboratories has developed a method by which DNA can be covalently bound to the microwell
surface, termed CovaLink NH. CovaLink NH is a polystyrene surface grafted with secondary amino
groups (>NH) that serve as bridge-heads for further covalent coupling. CovaLink Modules may be
purchased from Nunc Laboratories. DNA molecules may be bound to CovaLink exclusively at the
5'-end by a phosphoramidate bond, allowing immobilization of more than 1 pmol of DNA
(Rasmussen et al., (1991) Anal. Biochem. 198(1) 138-42).

The use of CovaLink NH strips for covalent binding of DNA molecules at the 5'-end has
been described (Rasmussen et al., (1991)). In this technology, a phosphoramidate bond is employed
(Chu et al., (1983) Nucleic Acids Res. 11(8) 6513-29). This is beneficial as immobilization using
only a single covalent bond is preferred. The phosphoramidate bond joins the DNA to the
CovaLink NH secondary amino groups that are positioned at the end of spacer arms covalently
grafted onto the polystyrene surface through a 2 nm long spacer arm. To link an oligonucleotide to
CovaLink NH via a phosphoramidate bond, the oligonucleotide terminus must have a 5'-end
phosphate group. It is, perhaps, even possible for biotin to be covalently bound to CovaLink and
then streptavidin used to bind the probes.

More specifically, the linkage method includes dissolving DNA in water (7.5 ng/ul) and
denaturing for 10 min. at 95°C and cooling on ice for 10 min. Ice-cold 0.1 M 1-methylimidazole,
pH 7.0 (1-MeIm7), is then added to a final concentration of 10 mM 1-MeIm7. The ssDNA solution is
then dispensed into CovaLink NH strips (75 ul/well) standing on ice.

Carbodiimide, 0.2 M 1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide (EDC), dissolved in
10 mM 1-MeIm7, is made fresh and 25 ul added per well. The strips are incubated for 5 hours at
50°C. After incubation the strips are washed using, e.g., Nunc-Immuno Wash; first the wells are
washed 3 times, then they are soaked with washing solution for 5 min., and finally they are washed
3 times (wherein the washing solution is 0.4 N NaOH, 0.25% SDS heated to 50°C).
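
The per-well quantities implied by the volumes and concentrations above can be checked with a
short calculation. The sketch below is illustrative only: the constant and function names are
assumptions, and the DNA mass is an upper bound because it ignores the small dilution that
occurs when 1-MeIm7 is added to the DNA stock before dispensing.

```python
DNA_CONC_NG_PER_UL = 7.5   # DNA dissolved in water (from the text)
DNA_VOLUME_UL = 75.0       # ssDNA solution dispensed per well (from the text)
EDC_STOCK_MOLAR = 0.2      # freshly made EDC solution (from the text)
EDC_VOLUME_UL = 25.0       # EDC solution added per well (from the text)


def per_well_summary():
    """Return total volume, approximate DNA mass and final EDC concentration per well."""
    total_volume_ul = DNA_VOLUME_UL + EDC_VOLUME_UL
    # Upper bound: treats the dispensed solution as undiluted 7.5 ng/ul DNA.
    dna_mass_ng = DNA_CONC_NG_PER_UL * DNA_VOLUME_UL
    # Simple dilution of the EDC stock into the combined well volume.
    edc_final_molar = EDC_STOCK_MOLAR * EDC_VOLUME_UL / total_volume_ul
    return {
        "total_volume_ul": total_volume_ul,
        "dna_mass_ng": dna_mass_ng,
        "edc_final_molar": edc_final_molar,
    }
```

Under these assumptions each well holds 100 ul in total, at most about 560 ng of DNA, and EDC
at a final concentration of 0.05 M.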

It is contemplated that a further suitable method for use with the present invention is that
described in PCT Patent Application WO 90/03382 (Southern & Maskos), incorporated herein by
reference. This method of preparing an oligonucleotide bound to a support involves attaching a
nucleoside 3'-reagent through the phosphate group by a covalent phosphodiester link to aliphatic
hydroxyl groups carried by the support. The oligonucleotide is then synthesized on the supported
nucleoside and protecting groups removed from the synthetic oligonucleotide chain under standard
conditions that do not cleave the oligonucleotide from the support. Suitable reagents include
nucleoside phosphoramidite and nucleoside hydrogen phosphonate.

An on-chip strategy for the preparation of DNA probe arrays may be employed. For
example, addressable laser-activated photodeprotection may be
employed in the chemical synthesis of oligonucleotides directly on a glass surface, as described by
Fodor et al. (1991) Science 251(4995) 767-73, incorporated herein by reference. Probes may also
be immobilized on nylon supports as described by Van Ness et al. (1991) Nucleic Acids Res.
19(12) 3345-50; or linked to Teflon using the method of Duncan & Cavalier (1988) Anal. Biochem.
169(1) 104-8; all references being specifically incorporated herein.

To link an oligonucleotide to a nylon support, as described by Van Ness et al. (1991),
requires activation of the nylon surface via alkylation and selective activation of the 5'-amine of
oligonucleotides with cyanuric chloride.

One particular way to prepare support bound oligonucleotides is to utilize the
light-generated synthesis described by Pease et al. (1994) PNAS USA 91(11) 5022-6, incorporated
herein by reference. These authors used current photolithographic techniques to generate arrays of
immobilized oligonucleotide probes (DNA chips). These methods, in which light is used to direct
the synthesis of oligonucleotide probes in high-density, miniaturized arrays, utilize photolabile
5'-protected N-acyl-deoxynucleoside phosphoramidites, surface linker chemistry and versatile
combinatorial synthesis strategies. A matrix of 256 spatially defined oligonucleotide probes may be
generated in this manner.

4.21 PREPARATION OF NUCLEIC ACID FRAGMENTS

The nucleic acids may be obtained from any appropriate source, such as cDNAs, genomic
DNA, chromosomal DNA, microdissected chromosome bands, cosmid or YAC inserts, and RNA,
including mRNA without any amplification steps. For example, Sambrook et al. (1989) describe
three protocols for the isolation of high molecular weight DNA from mammalian cells (pp.
9.14-9.23).

DNA fragments may be prepared as clones in M13, plasmid or lambda vectors and/or
prepared directly from genomic DNA or cDNA by PCR or other amplification methods. Samples
may be prepared or dispensed in multiwell plates. About 100-1000 ng of DNA samples may be
prepared in 2-500 ml of final volume.

The nucleic acids would then be fragmented by any of the methods known to those of skill
in the art including, for example, using restriction enzymes as described at 9.24-9.28 of Sambrook
et al. (1989), shearing by ultrasound and NaOH treatment.

Low pressure shearing is also appropriate, as described by Schriefer et al. (1990) Nucleic
Acids Res. 18(24) 7455-6, incorporated herein by reference. In this method, DNA samples are
passed through a small French pressure cell at a variety of low to intermediate pressures. A lever
device allows controlled application of low to intermediate pressures to the cell. The results of
these studies indicate that low-pressure shearing is a useful alternative to sonic and enzymatic DNA
fragmentation methods.

One particularly suitable way for fragmenting DNA is contemplated to be that using the two
base recognition endonuclease, CviJI, described by Fitzgerald et al. (1992) Nucleic Acids Res.
20(14) 3753-62. These authors described an approach for the rapid fragmentation and fractionation
of DNA into particular sizes that they contemplated to be suitable for shotgun cloning and
sequencing.

The restriction endonuclease CviJI normally cleaves the recognition sequence PuGCPy
between the G and C to leave blunt ends. Atypical reaction conditions, which alter the specificity of
this enzyme (CviJI**), yield a quasi-random distribution of DNA fragments from the small
molecule pUC19 (2688 base pairs). Fitzgerald et al. (1992) quantitatively evaluated the
randomness of this fragmentation strategy, using a CviJI** digest of pUC19 that was size
fractionated by a rapid gel filtration method and directly ligated, without end repair, to a lac Z minus
M13 cloning vector. Sequence analysis of 76 clones showed that CviJI** restricts PyGCPy and
PuGCPu, in addition to PuGCPy sites, and that new sequence data is accumulated at a rate
consistent with random fragmentation.
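
The difference between the normal and relaxed (CviJI**) specificities described above can be
illustrated by locating the corresponding sites in a sequence. The following is a minimal sketch
that treats the input as a linear sequence and models only the recognition patterns named in the
text, with the cut placed between the G and the C; the function names and the test sequence are
hypothetical, and real digests depend on reaction conditions that are not modelled here.

```python
import re

# Lookahead patterns are used so that overlapping sites are all reported.
# Pu = A or G, Py = C or T.
CVIJI_NORMAL = re.compile(r"(?=[AG]GC[CT])")                          # PuGCPy
CVIJI_RELAXED = re.compile(r"(?=[AG]GC[CT]|[CT]GC[CT]|[AG]GC[AG])")   # plus PyGCPy, PuGCPu


def cut_positions(seq, pattern):
    """Return the 0-based index of the first base after each cut (between G and C)."""
    return [match.start() + 2 for match in pattern.finditer(seq.upper())]


def fragments(seq, pattern):
    """Split a linear sequence into the blunt-ended fragments left by the cuts."""
    bounds = [0] + cut_positions(seq, pattern) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]
```

Because the relaxed pattern accepts three of the four possible NGCN contexts, it cuts at more
positions than the normal pattern on the same sequence, which is consistent with the more nearly
random fragment distribution reported above.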

As reported in the literature, advantages of this approach compared to sonication and 
agarose gel fractionation include: smaller amounts of DNA are required (0.2-0.5 µg instead of 2-5 
µg); and fewer steps are involved (no preligation, end repair, chemical extraction, or agarose gel 
electrophoresis and elution are needed). 

Irrespective of the manner in which the nucleic acid fragments are obtained or prepared, it is 
important to denature the DNA to give single stranded pieces available for hybridization. This is 
achieved by incubating the DNA solution for 2-5 minutes at 80-90°C. The solution is then cooled 
quickly to 2°C to prevent renaturation of the DNA fragments before they are contacted with the 
chip. Phosphate groups must also be removed from genomic DNA by methods known in the art. 

4.22 PREPARATION OF DNA ARRAYS 

Arrays may be prepared by spotting DNA samples on a support such as a nylon membrane. 
Spotting may be performed by using arrays of metal pins (the positions of which correspond to an 
array of wells in a microtiter plate) to repeatedly transfer about 20 nl of a DNA solution to a 
nylon membrane. By offset printing, a density of dots higher than the density of the wells is 
achieved. One to 25 dots may be accommodated in 1 mm², depending on the type of label used. By 
avoiding spotting in some preselected number of rows and columns, separate subsets (subarrays) 
may be formed. Samples in one subarray may be the same genomic segment of DNA (or the same 
gene) from different individuals, or may be different, overlapped genomic clones. Each of the 
subarrays may represent replica spotting of the same samples. In one example, a selected gene 
segment may be amplified from 64 patients. For each patient, the amplified gene segment may be in 
one 96-well plate (all 96 wells containing the same sample). A plate for each of the 64 patients is 
prepared. By using a 96-pin device, all samples may be spotted on one 8 x 12 cm membrane. 
Subarrays may contain 64 samples, one from each patient. Where the 96 subarrays are identical, the 
dot span may be 1 mm² and there may be a 1 mm space between subarrays. 
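
The 64-patient example may be pictured with the following minimal Python sketch of offset printing. It is an assumption-laden illustration, not the method of the specification: the 9 mm pin pitch (the usual 96-well spacing), the 8 x 8 arrangement of dots within a subarray, and the function name dot_position are all hypothetical.

```python
PIN_ROWS, PIN_COLS = 8, 12   # 96-pin device matching a 96-well plate
PIN_PITCH_MM = 9.0           # assumed pin spacing (standard 96-well pitch)
DOT_PITCH_MM = 1.0           # one dot per 1 mm step within a subarray
PATIENTS = 64                # one plate, and hence one print pass, per patient
SUBARRAY_SIDE = 8            # assumed 8 x 8 = 64 dots per subarray


def dot_position(patient, pin_row, pin_col):
    """Return (x_mm, y_mm) of the dot printed by one pin for one patient.

    Each patient's plate is printed with the pin head shifted by a small
    offset, so every pin lays down its own 8 x 8 subarray of 64 dots.
    """
    if not 0 <= patient < PATIENTS:
        raise ValueError("patient index out of range")
    if not (0 <= pin_row < PIN_ROWS and 0 <= pin_col < PIN_COLS):
        raise ValueError("pin index out of range")
    offset_row, offset_col = divmod(patient, SUBARRAY_SIDE)
    x_mm = pin_col * PIN_PITCH_MM + offset_col * DOT_PITCH_MM
    y_mm = pin_row * PIN_PITCH_MM + offset_row * DOT_PITCH_MM
    return x_mm, y_mm


if __name__ == "__main__":
    print(dot_position(0, 0, 0))     # first patient, first pin
    print(dot_position(63, 7, 11))   # last patient, last pin
```

Under these assumptions each subarray occupies 8 mm of every 9 mm pin interval, which leaves the 1 mm space between subarrays, and the outermost dot lies at 106 mm by 70 mm, within an 8 x 12 cm membrane.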

Another approach is to use membranes or plates (available from NUNC, Naperville, Illinois) 
which may be partitioned by physical spacers, e.g., a plastic grid molded over the membrane, the grid 
being similar to the sort of membrane applied to the bottom of multiwell plates, or hydrophobic 
strips. A fixed physical spacer is not preferred for imaging by exposure to flat phosphor-storage 
screens or x-ray films. 

The present invention is illustrated in the following examples. Upon consideration of the 
present disclosure, one of skill in the art will appreciate that many other embodiments and variations 
may be made in the scope of the present invention. Accordingly, it is intended that the broader 
aspects of the present invention not be limited to the disclosure of the following examples. The 
present invention is not to be limited in scope by the exemplified embodiments which are intended 
as illustrations of single aspects of the invention, and compositions and methods which are 
functionally equivalent are within the scope of the invention. Indeed, numerous modifications and 
variations in the practice of the invention are expected to occur to those skilled in the art upon 
consideration of the present preferred embodiments. Consequently, the only limitations which 
should be placed upon the scope of the invention are those which appear in the appended claims. 

All references cited within the body of the instant specification are hereby incorporated by 
reference in their entirety. 

5.0 EXAMPLES 

5.1 EXAMPLE 1 

Novel Nucleic Acid Sequences Obtained From Various Libraries 
A plurality of novel nucleic acids were obtained from cDNA libraries prepared from various 
human tissues and in some cases isolated from a genomic library derived from human chromosome 
using standard PCR, SBH sequence signature analysis and Sanger sequencing techniques. The 
inserts of the library were amplified with PCR using primers specific for the vector sequences 
which flank the inserts. Clones from cDNA libraries were spotted on nylon membrane filters and 
screened with oligonucleotide probes (e.g., 7-mers) to obtain signature sequences. The clones were 
clustered into groups of similar or identical sequences. Representative clones were selected for 
sequencing. 

In some cases, the 5' sequence of the amplified inserts was then deduced using a typical 
Sanger sequencing protocol. PCR products were purified and subjected to fluorescent dye 
terminator cycle sequencing. Single pass gel sequencing was done using a 377 Applied Biosystems 
(ABI) sequencer to obtain the novel nucleic acid sequences. In some cases RACE (Rapid 
Amplification of cDNA Ends) was performed to further extend the sequence in the 5' direction. 

5.2 EXAMPLE 2 

Novel Contigs 

The novel contigs of the invention were assembled from sequences that were obtained from 
a cDNA library by methods described in Example 1 above, and in some cases sequences obtained 
from one or more public databases. The sequences for the resulting nucleic acid contigs are 
designated as SEQ ID NO: 1-1350 and are provided in the attached Sequence Listing. The contigs 
were assembled using an EST sequence as a seed. Then a recursive algorithm was used to extend 
the seed EST into an extended assemblage, by pulling additional sequences from different databases 
(i.e., Hyseq's database containing EST sequences, dbEST version 114, gb pri 114, and UniGene 
version 101) that belong to this assemblage. The algorithm terminated when there were no 
additional sequences from the above databases that would extend the assemblage. Inclusion of 
component sequences into the assemblage was based on a BLASTN hit to the extending 
assemblage with BLAST score greater than 300 and percent identity greater than 95%. 
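
The seed-extension procedure may be sketched, purely for illustration, as the loop below. The sketch is written iteratively rather than recursively, and the comparison step is passed in as a caller-supplied function standing in for a BLASTN search; Hit, passes, extend_assemblage and the compare parameter are hypothetical names and do not describe the software actually used.

```python
from typing import Callable, Iterable, List, NamedTuple


class Hit(NamedTuple):
    score: float             # BLAST score of the best alignment
    percent_identity: float  # percent identity of that alignment


MIN_SCORE = 300.0     # "BLAST score greater than 300"
MIN_IDENTITY = 95.0   # "percent identity greater than 95%"


def passes(hit: Hit) -> bool:
    """Apply the two inclusion thresholds stated in the text."""
    return hit.score > MIN_SCORE and hit.percent_identity > MIN_IDENTITY


def extend_assemblage(seed: str,
                      candidates: Iterable[str],
                      compare: Callable[[str, str], Hit]) -> List[str]:
    """Grow an assemblage from a seed until no candidate extends it.

    `compare(member, candidate)` is assumed to return the best hit between
    two sequences (here identified by name); it stands in for BLASTN.
    """
    members = [seed]
    remaining = [c for c in candidates if c != seed]
    grew = True
    while grew:  # stop when a full pass adds nothing
        grew = False
        for cand in list(remaining):
            if any(passes(compare(m, cand)) for m in members):
                members.append(cand)
                remaining.remove(cand)
                grew = True
    return members


if __name__ == "__main__":
    toy_hits = {("seed", "a"): Hit(400, 98.0),
                ("a", "b"): Hit(350, 96.0),
                ("seed", "c"): Hit(500, 90.0)}  # fails the identity cutoff

    def toy_compare(x, y):
        return toy_hits.get((x, y)) or toy_hits.get((y, x)) or Hit(0, 0.0)

    print(extend_assemblage("seed", ["a", "b", "c"], toy_compare))
```

In the toy data, sequence b is pulled in only because it matches a, which was itself added in the same pass; this is the sense in which the assemblage is extended step by step from the seed.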

Table 3 sets forth the novel predicted polypeptides (including proteins) encoded by the 
novel polynucleotides (SEQ ID NO: 189-282) of the present invention, and their corresponding 
nucleotide locations to each of SEQ ID NO: 189-282. Table 3 also indicates the method by which 
the polypeptide was predicted. Method A refers to a polypeptide obtained by using a software 
program called FASTY (available from http://fasta.bioch.virginia.edu) which selects a polypeptide 
based on a comparison of the translated novel polynucleotide to known polynucleotides (W.R. 
Pearson, Methods in Enzymology, 183:63-98 (1990), herein incorporated by reference). Method B 
refers to a polypeptide obtained by using a software program called GenScan for human/vertebrate 
sequences (available from Stanford University, Office of Technology Licensing) that predicts the 
polypeptide based on a probabilistic model of gene structure/compositional properties (C. Burge 
and S. Karlin, J. Mol. Biol., 268:78-94 (1997), incorporated herein by reference). Method C refers 
to a polypeptide obtained by using a Hyseq proprietary software program that translates the novel 
polynucleotide and its complementary strand into six possible amino acid sequences (forward and 
reverse frames) and chooses the polypeptide with the longest open reading frame. 
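
The general idea behind Method C, six-frame translation followed by selection of the longest open reading frame, may be sketched as follows. This is not the Hyseq proprietary program: the sketch assumes the standard genetic code, and assumes that an open reading frame runs from the first methionine of a stop-free stretch to the next stop codon or the end of the sequence. All function names are hypothetical.

```python
BASES = "TCAG"
# Standard genetic code in TCAG order; "*" marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq):
    """Return the complementary strand, read 5' to 3'."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons; unknown codons become 'X'."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))


def longest_orf(dna):
    """Return the longest Met-initiated ORF over all six reading frames."""
    dna = dna.upper()
    best = ""
    for strand in (dna, reverse_complement(dna)):
        for frame in range(3):
            for segment in translate(strand[frame:]).split("*"):
                start = segment.find("M")
                if start != -1 and len(segment) - start > len(best):
                    best = segment[start:]
    return best


if __name__ == "__main__":
    print(longest_orf("ATGGCCAAATAG"))                      # forward strand
    print(longest_orf(reverse_complement("ATGGCCAAATAG")))  # same ORF, reverse
```

Because both strands are scanned, the same polypeptide is found whether the input is given in the sense or the antisense orientation.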

The nearest neighbor results for SEQ ID NO: 1-1350 were obtained by a BLASTP 
version 2.0a19MP-WashU search against Genpept release 120 and Geneseq database October 
12, 2000, update 21 (Derwent), using the BLAST algorithm. The nearest neighbor result showed the 
closest homologue for SEQ ID NO: 1-1350. The nearest neighbor results for SEQ ID NO: 1-1350 
are shown in Table 2 below. 

Tables 1, 2 and 3 follow. Table 1 shows the various tissue sources of SEQ ID NO: 1-1350. 
Table 2 shows the nearest neighbor result for the assembled contig. The nearest neighbor result 
shows the closest homolog with an identifiable function for each assemblage. Table 3 contains the 
start and stop nucleotides for the translated amino acid sequence for which each assemblage 
encodes. Table 3 also provides a correlation between the amino acid sequences set forth in the 
Sequence Listing, the nucleotide sequences set forth in the Sequence Listing and the SEQ ID NO. in 
USSN 09/496,914. 
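
The SEQ ID NOS entries of Table 1 are written as space-separated numbers with hyphenated ranges (for example "662-665"), some of which wrap across lines. A minimal Python sketch for expanding such an entry into individual SEQ ID numbers is given below; the function name expand_seq_ids is hypothetical, and the sketch assumes a cleanly transcribed entry.

```python
import re


def expand_seq_ids(cell):
    """Expand an entry such as '39 216 1178-1180' into a list of integers."""
    # Rejoin a range whose hyphen was left at the end of a line.
    cell = re.sub(r"-\s+", "-", cell)
    ids = []
    for token in cell.split():
        if "-" in token:
            first, last = (int(part) for part in token.split("-"))
            ids.extend(range(first, last + 1))
        else:
            ids.append(int(token))
    return ids


if __name__ == "__main__":
    print(expand_seq_ids("39 216 1178-1180 1199"))
```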
TABLE 1 



Tissue Origin 

adult brain 


RNA Source 
GIBCO 


Hyseq Library Name 
AB3001 


SEQ ID NOS: 
HI 151 188 215 662-665 877 91U 927 
976 1233 1319 

adult brain 


GIBCO 


ABD003 


4149 74 101 111 120 132 141-142 iM 
217225 238 271 317 404 446 469503 
513-514 535 550 564 573 666-669798 
898 910 927 976 1067 1083 1085 1178 
1254 


adult brain 


Clontech 


ABR001 
ABR006 


39 216 238 327 356 535 927 1056 1 l*i 
1178-1180 1199 1251 J 
74 61194910341136 


adult brain 
adult brain 

adult brain 


Clontech 
Clontech 

Clontech 


ABR008 
ABR011 


14 32 41 61 81 86 89 120 132 138 145 1 
147 1 88 197 208 225 227-239 250 300- 
303 312 316 328-331 340 357-362374 
380 3 84-391 408 414 446 448 464-467 
483 488 495-496 505 5 12 521 535 550 
566 571 577 585 590 594 598 634641 
658 666 683 725 742 764 767 786 801 
805 810 823 826 829 831 836 841 887- 
923 927 934 943 950-95 1 963 976 995 
1000-1001 1006 1026 1034 1048 1U37- 
1067 1086 1088 1090 1118 1120 1122- 
1128 1142 1162 1181-1192 119? 1^4 
1218-1219 1225 1232 1253 1267 1271- 

1306 1342 1347 1349-1350 J 

49 238 1219 
74 238 


adult brain 
adult brain 
adult brain 


BioChain 
Invitrogen 
Invitrogen 


ABR012 
ABR013 
ABT004 


868 1268 

49 117 138 191 217 252 291 305 535 I 
566 596 663 670 746 798 816-819 876 
892 898 922 943 963 1034-1036 1 121 


cultured 
preadipocytes 


Stratagene 


ADP001 


j 41 74 101 138 211 238 304 537 58* 
740 798 883 943 976 1067 


adrenal gland 


Clontech 


ADR002 


49 74 101 111 120127 151215 238 
240-247 316 330 363-364 404 414 534- 
535 833 924-940 950 963 976 1001 
1003 1067-10701118 11561193-1200 

1 1325 


adult heart 


GIBCO 


AHR001 


1 38 49 71-7274-77 79 92 99101 111 ] 
118 129 132 138 151 158-163 182 195- 
203 215 217 238 264 269 353 384 398 
408 434-439 446 504 5 12-513 5 19 537 
562-573 577 611-614 616-619 658 661 
671-672 722 734 757-773 815 828-835 
874 891 898 919 926-927 976 988 
1021 1037 1041 1062 1067 1071 108U 
1083 1093 1122 1131 1185 1201 1254 
1308 1331 1335 


adult kidney 


GIBCO 


AKD001 


+41 49 51 71-74 78-85 94 100-101 103- | 
107 111 119-120138 151 157 215 217- 
21 8 238 250 264 294 304 384 404 440 
446 454 477 504-505 509 514 518-519 
535 537 564 574-583 620-627 639 653 
673-675 705 753 789 831 844 851 859 
877 909 918 927 956 963 976 1067 
1074 1083 1095 1178 1302 1331 1335 


adult kidney 
adult lung 


Invitrogen 
GIBCO 


AKT002 
ALG001 


11-12 41 49 111-112 215-217 294 316 
446 487 564 575 844 868 910 927 976 
1 1116 

1 8101 111 151 187 402 446 49U 514 








^12 ^"1*7 KA c c/in coa coo <rv> «\a a 

jioZjf D*o 549 580 582 592 594 634 
640 65 1-652 676-678 725 85 1 873 91 8 
952976 1042 1067 1076 1083 1152 


lymph node 


Clontech 


AT NOA1 


0 111 111 i ;i ion ion loonir cnn 

o 11 J i/i 1M 180-182 188215 537 

54^ ^40 /iM AOQ £120 TOO Q(\A Q1f\ QCQ 

jhj o<*y ojI O/7-Ooz loy o04-olU ooo 
873 927952 976 1042 10591335 


young liver 


GIBCO 


ALV001 


8 64 79 111 186 215-216 238 446 514 
5 19 537 564 653 683-684 698 753 798 

813 833 S4G 858 927 976 1C33-1C39 
1051 1085 1224 1245 1256 


adult liver 


Invitrogen 


ALV002 


40 71 292-293 305 384 468-469 496 
505 657 675 714 753 832 844 941-942 
976 1040 1076 1256 1293 


adult liver 


Clontech 


ALV003 


976 


adult ovary 


Invitrogen 


AOV001 


8 32 36 38 41 49 51 71 74 79-80 101 
104 111 120 122-125 138 140 143-149 
151 188-190207-212 215-217238 264 
3 16 384 409 440 445-446 496 504 5 12 

C 1 A < 1 0 'CIO Cic an c At\ cr/v t? /~ a err 

514 518-519 535 537 549-550 564 566 

571 580 582 600 618 638 657 667681 

685-697 699 705 722 735-744 761 771 

815 833 842-865 868 875-876 918 926- 

927 950 952 963 976 3023 1042 1048 

1051 1059 1072 1076 108311171120 

1124 1131 11441174 12241268 1331 
i iis 

1553 


adult placenta 


Clontech 


APL001 


102 217 238 537 641 700 


placenta 


UJrlUUgCll 


api ft no 


oo 5 851 1048 


adult spleen 


GIBCO 


ASP001 


8 45 74 111 132 140 151 185 217238 
294 414 446 477 504 514 534 545 549 
592 722 873 883 952 976 1041-1042 
1083 1093-1094 1152 1224 


testis 


GIBCO 


ATS001 


72 107 111 U3 326 140 151 183 215 
238 446 497 537 642 701-706 811 877 
927 962 976 1083 1117 1131 


adult bladder 


Invitrogen 


BLD001 


41 151 191402-405409414496 545 ' 
592 607706 873 952 1178 1329-1335 


bone marrow 


Clontech 


BMD001 


8 58-62 65-68 74 79 108 111 116 137 
147 151 164-174213-215238 305-307 
374 404 446 460 466 516 519 534 538- 
541 544-546 549-554 566 584 586 592 
596 607 610 628-629 643-645 652 707- 

*7AO nn A *70ft OAA occ on orro nin 

/Uo / /4-/89 844 866-871 873 919 927 
952 963 976 998 1034 1042 1064 1083 
1085 1120 1132 1152 1225 1229 1268 
iift7 I'll h 


bone marrow 


Clontech 


BMD002 


6 8 37-3852 7477 105111 329 132 
210317510-511 545 549581 598628 
638 724 766 789 844 860 868 873 919 
927 952 963 968 976 1042 1 1 1 1 ! 141 

1 1/SA 1 1£1 IOOO 10/CiC IIAC. 
llOU-1 10 1 lLzy l/OO 1J46 


bone marrow 


Clontech 


BMD004 


111 238 282 549 1083 


adult colon 


Invitrogen 


CLN001 


52 260 264 299 494 536 545 564 592~ 
844 873 877 952 976 1042 1 152 1268 
1336-1337 


adult cervix 


BioChain 


CVX001 


49 51429 132 151 205 207 238 332- 
335 365-367 392-401 440 466 470-471 
5 1 8 537 597 629 832 877 927 976 1006 
1085 1117 1129-11341192 1202-1205 
1219 1309-1328 


diaphragm 


BioChain 


DIA002 


74 976 1083 



endothelial cells 



EDT001 




32 40-41 49 74 79 101 111 120 YS1 
138 151 204-206 215-217 238 269 316 
414 433 505 510 513 550 555 580 582 
596 675 722 745 798 814 836-841 851 
918 976 1041 1043 1073 1083 1131 
1331 



Genomic clones 
from the short arm 
of chromosome 8 



Genomic DNA 
from Genetic 
Research 



EPM001 



525-532 927 



Genomic clones 
from the short arm 
of chromosome 8 



Genomic DNA 
from Genetic 
Research 



EPM003 



Genomic clones 
from the short arm 
of chromosome 8 



Genomic clones 
from the short arm 
of chromosome 8 



Genomic DNA 
from Genetic 

Research 

Genomic DNA 
from Genetic 
Research 



EPM004 



EPM005 



47525 



52592T 



531 



esophagus 



BioChain 



ESO002 



fetal brain 



Clontech 



FBR001 



74 138 238 



441-442 927 



fetal brain 



Clontech 



fetal brain 



Clontech 



fetal brain 



FBR004 
FBR006 



Clontech 



FBRs03 



215 893 927 1001 



48 61 101 120 132 138 140 147 208 
225 271 317 319 336 359 368 405-414 
519 550 571 594 686 715 722 764 824 
829 836 859 909 927 943 947 963 1057 
1067-1068 1104 1135-1140 1162 1206- 
1207 1235 1268 1288 1307-1308 1319 
1338-1350 



11446 



fetal brain 



Invitrogen 



FBT002 



4151 120 151 192-194 264 504 512 
535 683 761 798 820-827 844 876 909 
963 976 1026 1048 1083 1144 1302 



fetal heart 



Invitrogen 



FHR001 



fetal kidney 



Clontech 



FKD001 



fetal kidney 



Clontech 



FKD002 



fetal kidney 



Invitrogen 



FKD007 



446 566 761 



51 74 111 127 140 151 184294 537 
550 630-631 1319 



111 976 1083 



238 974 



fetal lung 



Clontech 



FLG001 



463 566 976 1074 1083 1093 



fetal lung 



Invitrogen 



FLG003 



fetal liver-spleen 



Columbia 
University 



FLS001 



fetal liver-spleen 



Columbia 
University 



FLS002 



41 238 330407 415-416 537 573 844 
859 1048 1083 11161192 



8 14 34-35 37 41 43 49 51 54-56 W-04 
69-71 74 77 79 87-90 101 107 110-111 
114 120 128-131 138 140 147 150-155 
197 210 215 217 225 238 312 367 384 
414 440 446 460 468 483 496 504-507 
511-515 518-519 523 533-535 537 541 
544-545 547-550 555-560 564 566 571 
577 582 585-586 598 636 646-647 649 
652 664 698 709-710 714 722-723 731 
735-736 746-753 761 784 798 823 829 
832 844 851 858-859 868 873 876 898 
927 943 949 952 963 976 984 1002 
1021 1023 1040 1042 1044 1050 1083 
1093 11161120 1129 1131 1144 1174 
1217 1251 1254 1256 1302 1308 1311 
1319 



8 36-37 41-46 49 54 64 71 74 79 101 
111 120129 147207 210215-216238 
250 330 353 359 366 383-384 414 478 
505 508-509 511 515-524 534-535 537 
544-545 564 566 571 577591 598 638 






WO 01/57188 



PCT/US01/03800 











fsf\"X A71 AQfi 11 A TOO TIG. Ttl TCI nno 

ooj o/i oyo /14 ill flo ill tjl 79o 

RSI RSQ 8*71 R7£ QftQ OT7 QAQ Q<7 OQ1 

984 1002 1023 1042-1044 1085 1095 ! 
1131 11441178 1199 1233 1240-1270 
1331 1340 


fetal liver-spleen 


Columbia 
University 


FLS003 


64 535 976 1256 


fetal liver 


Invitrogen 


FLV001 


8 101 120138 217446468 535 566 
580 722 730 749 844 91 S 943 975 1051 
1256 1331 


fetal liver 


Clontech 


FLV004 


537 926 1256 


fetal muscle 


Invitrogen 


FMS001 


Ki 111 In 1ti{\ mr\ acia Atn /io i 

M Ul 264 312 369-370 404 417-421 

49< C37 CT7 COG <T1 OCT * 1 > 1 
IJZVQ izOo 


fetal muscle 


Invitrogen 


FMS002 


537 


fetal skin 


Invitrogen 


FSK001 


13-2632 41 5189107111 147151 
225 264 316 405 422429 488-494 496 
5 19 534-535 537 566 675 732 859 876- 

0*7*7 pno fM*7 r\Af\ ncA n« n*r^ i r\r\ i 

o / / 89 o 947 949-950 963 976 1 001 
106210761083 111711441165 1268 
1281 


fetal skin 


Invitrogen 


FSK002 


537 812 


fetal spleen 


BioChain 


FSP001 


87 549 


umbilical cord 


BioChain 


FUC001 


27-33 41 49 151 215 238248-249301 
3 16 446 495-503 51 9 521 534-535 537 
582 634 691 877 883 927 944-950 963 
976 1001 1075 1142-1143 1171 1218 
1243 1308 


fetal brain 


GIBCO 


HFB001 


41 49 5779 87 103. Ill 120 132-135 
138 145 151 188 197 207215 238264 
27 1 294 3 1 6 367 4 14 440 446 466 504 
513-514 535 542-543 550 564 571 596 
635 648-654 675 71 1-715 722-723 798 
832 872 876 883 927 976 1095 1 144 
1168 1171 1178 1211 1335 


1UUVJ vUiiUgv 


Tn \i i trn (yah 

uiviuugcn 


nlVlrUUl 


238 


infant brain 


Columbia 
University 


IB2002 


49-50 77 81 89 105 111 136-138 140 
151 161 175-179185 216-217264295 
299 308-310 371-373 462 476 504 51 1- 
513 533 537 564 566 571 655-657 662 
083 716-720 723 752 790-803 829 832 
858-859 876 898 909 949 976 1045- 

1 (\A1 1 A7*C 1 AQ*7 1 rtfin 1 ftni 111/1 tr\*\ 

Imi IV/O-iUoV 1U9U 1093 1116 1122 

1 144 lOAO-IOII 100< 1071 1")<£. in a 
l IHH izuy-jzij IZxO l^JJ IZJU 13 17 

1341 


infant brain 


Columbia 
University 


IB2003 


41 50 77 104 132 215 238 508 512-513 
519 566 655 714 794 918 943 976 1067 
1092-1093 1233 


infant brain 


Columbia 
University 


IBM002 


311 472-473 753 1214 


infant brain 


Columbia 
University 


IBS001 


51 111 376474790 876 949 1144 1204 
1221 


lung, fibroblast 


Stratagene 


LFB001 


151 316 462 514 534 582 675 939 1131 


lung tumor 


Invitrogen 


LGT002 


1-741 7479 94 115 120 138-139 156 
215217 269 280 296 337 374-375 384 
404446454475-480498514 518-519 
522 537 545 564 577 597 653 658 705 
721-724 754-756 779 859 868 872-874 
876-877 919 927 949 951-952 959 976 
1002 1042 1048-1053 1076 1083 1088- 
1089 1131 1144-1147 1216-1218 1229 
1293 1311 


lymphocytes 


ATCC 


mmmm 



41 74 111 132 151 253 316 446 
634 844 927 976 1085 1268 


leukocyte 


GIBCO 


LUC001 


8114174 86 91-98 101 109111 12U 
U7 151 212 215218 238 252 288 312- 
314.316 338 359408 427 443-447 505 
510 512 514 518 534 545 549-550 561 
564 566 571 577 580 582 587-609 615 
632-638 658-659 698 714 725-728 832 
836 841 859 866 873-874 882-883 918- 
01 Q Q77 943 952 963 976 1042 1076 
1083 1090 1148 1152 1168 1195 1219- 
1220 1224 


leukocyte 


Clontech 


LUC003 


74 100 215 232 238 339-341 446 545 
657 660 729 873 883 927 952 963 1008 
1042 1116 1120 1149-1150 1215 1222 


Melanoma from cell 
line ATCC #CRL 
1424 


Clontech 


MEL004 


210 215 238 342 534 545 592 Til 873 
919 929 939 952 976 1071 1118 1218 


mammary gland 


Invitrogen 


MMG001 


8-10 40-41 49 73 80 114 138-140 147 
01 1 n<t\-OSA 9 Ad 9Q7-9QQ ^05 377-378 
398 446 481-486 505 512 537 545 549 
571 592 725 730-733 816 829 836 844 
868 873 876-877 898 926 943 951-960 
963 976 995 1034 1042 1048 1054- 
1055 1076 1083 1091 1093 1116-1117 
11241152 1302 


induced neuron cells 


Stratagene 


NTD003 


39101 111 138238 361 1225 1251 
1319 


retinoic acid induced 

neuronal cells 

neuronal cells 


Stratagene 
Stratagene 


NTR001 
XI tt Toni 

N J UUvl 


74 225 976 

129 225 238 304 313 361 657 976 
976 


pituitary gland 

placenta 

prostate 


Clontech 
Clontech 
Clontech 


PIT004 

PLA003 

PRT001 


38 976 

1 1 1 188 238 257-258 564 724 961-966 
1067 1095 


rectum 


Invitrogen 




238 430-431 841 859 868 963 1001 
1136 


salivary gland 


Clontech 


SAL001 


8 151 402 432-433 446 496 868 952 
976 1083 1120 1151 1184 


small intestine 


Clontech 


SIN001 
SKM001 


' 8101 147 215 259-266 446 402 
545 592 660 789 836 866 873 927 952 
963 967-978 1042 1120 1152 1223- 
1224 

238 302 927 943 992 1031 


skeletal muscle 
spinal cord 


Clontech 
Clontech 


SPC001 


74 111 132151215-216 238 264267- 
270 343-344 353 379 516 537 566 740 
828 927 976 979-994 1092 1153-1159 
1225 1250 
698 859 1042 


adult spleen 

stomach 


Clontech 
Clontech 


SPLc01 
STO001 


'210 238 271-272 537 580 705 918 952 
995 1171 


thalamus 


Clontech 


THA002 


~ "61 219-220273-276 312 315 330 596 
963 996-1007 1059 1093 1160-1162 


thymus 


Clontech 


THM001 


- 8 120 151 208 221 316-317 3X*t*y 
750 867 874 878-881 927 963 1023 
1083 1094-10961124 


thymus 


Clontech 


THMc02 


8 61 114 129 132 210 225 231 3UO 
317-319 336 340 359 380 398 446448- 
463 512 519 545 554 587 598 698 724- 
725 789 812 836 868 873 927 947 952 








ota ^f\(\1 1A/40 iaoo lAoe iaat i'h/ 
9/0 1UU/ 104Z 1083 1085 1097-1116 

1122 1147 1177 1226-1229 1234 1311 

1313 


thyroid gland 


Clontech 


THR001 


144149 7694111 144 151 183 188 
210 217 222 253 264 271 277-286 294 
320-326 345-352 361 381-382446 467 
483 514 534 549-550 564 578 602 649 
844 882-883 927 950 956 976 1008- 
1028 1D76 1083 !! 17-! 120 1142 1163- 
1175 1230-1238 1308 


trachea 


Clontech 


TRC001 


223-225 238 287 353-354 514 
545 592 611 873 883-884 927 
952 1029-1031 1042 1151-1152 
1170 1176-1177 1239 


uterus 


Clontech 


UTR001 


151 226 288-290 355 537 877 
885-886 976 1001 1032-1033 
1232 



TABLE 2 



SEQ ID NO: 


Accession No. 


Species 


Description 


Smith-Waterman Score 


% Identity 


1 


B02829 


XlvJlUV aajJIGUa 


numan u proiem coupieo receptor nKUrj 

nrntpin QPfl m WfVlfl 


460 


100 


2 


G03564 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7645. 


111 
ill 


51 


3 


R26173 


Homo sapiens 


P?^rt At Ayfsitf\r Va t\ o t*q a at% 1 a p /• o nil/von 

x <u i ui iviajuj i u ptuaucopiasuc aJlllgCn 

(CDR62) encoded by clone pY2. 


293 


76 


4 


L29536 


Homo sapiens 


calcium channel L-type alpha J subunit 


191 


65 


5 


Y94943 


Homo sapiens 


Human secreted protein clone ytl4_l protein 
sequence SEQ ID NO:92. 


251 


50 


6 


M11507 


Homo sapiens 


transferrin receptor 


120 


95 


7 


AF099100 


Homo sapiens 


WD-repeat protein 6 


1941 


93 


8 


Y92338 


Homo sapiens 


Human cancer associated antigen precursor from 
clone NY-REN-45. 


245 


82 


9 


G01343 


Homo sapiens 


Human secreted protein, SEQ ID NO: 5424. 


226 


91 


10 


AJ133798 


Homo sapiens 


copine VII protein 


1127 


68 


11 


G02449 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6530. 


584 


99 


12 


X98330 


Homo sapiens 


ryanodine receptor 2 


282 


78 


13 


AL024498 


Homo sapiens 


dJ417M14.2 (novel serine/threonine-protein 
kinase (ortholog of mouse and rat MAK (male 
germ cell-associated kinase)) 


293 


100 


14 


AF045577 


Pan 

troglodytes 


olfactory receptor OR93Ch 


191 


34 


15 


G03131 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7212. 


93 


39 


16 


U26595 


Rattus 
norvegicus 


prostaglandin F2a receptor regulatory protein 
precursor 


569 


89 


17 


B08918 


Homo sapiens 


Human secreted protein sequence encoded by 
gene 28 SEQ ID NO:75. 


99 


44 


18 


Y36203 


Homo sapiens 


Human secreted protein #75. 


165 


75 


19 


U15647 


Mus 

musculus 


reverse transcriptase 


106 


40 


20 


G02701 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6782. 


544 


100 


21 


Y35923 


Homo sapiens 


Extended human secreted protein sequence, SEQ 
ID NO. 172. 


1691 


100 


22 


G04030 


Homo sapiens 


Human secreted protein, SEQ ID NO: 8111. 


380 


96 


23 


G02455 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6536. 


123 


50 


24 


AF036329 


Homo sapiens 


gonadotropin-releasing hormone precursor, 
second form 


284 


90 


25 


G04067 


Homo sapiens 


Human secreted protein, SEQ ID NO: 8148. 


96 


32 


26 


S80119 


Rattus sp. 


reverse transcriptase homolog 


100 


34 


27 


U83303 


Homo sapiens 


Hne-1 reverse transcriptase 


101 


35 


28 


G03267 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7348. 


135 


45 






musculus 








75 


Y00826 


Rattus 
norvegicus 


gp210(AA 1-1886) 


413 


84 


76 


AF117754 


Homo sapiens 


thyroid hormone receptor-associated protein 
complex component TRAP240 


351 


54 


77 


Y38422 


Homo sapiens 


Human secreted protein. 


468 


76 


78 


" Y14596 


Homo sapiens 


Human T-type voltage-gated Ca channel alpha- 
M(hCavT3). 


1357 


99 


79 


Y1459I 


Human 
papillomavirus 
type 68 


APM-1 protein 


767 


inn 


80 


AL137802 


Homo sapiens 


dJ798A10.2 (KIAA0445 protein) 


71 


34 


81 


AP000383 


Arabidopsis 
thaliana 


protein arginine N-methyltransferase-like protein 


359 


65 


82 


L46815 


Mus 

musculus 


DNA binding protein Rc 


895 


75 


83 


G01600 


Homo sapiens 


Human secreted protein, SEQ ID NO: 5681. 


315 


96 


84 


Y53886 


Homo sapiens 


A suppressor of cytokine signalling protein 
designated HSCOP-6. 


538 


71 


85 


AB029002 


Homo sapiens 


jkjAAjQ70 protein 


134 


42 


86 


Y28678 


Homo sapiens 


Human cw272 7 secreted protein. 


325 


62 


87 


Y99368 


Homo sapiens 


Human PRO1326 (UNQ686) amino acid 
sequence SEQ ID NO:100. 


156 


AQ 

l Kj i 




AJ225124 


Mus 

musculus 


hyperpolarization-activated cation channel, 
HAC3 


487 


95 


89 


AF177203 


Homo sapiens 


cerebral cell adhesion molecule 


290 


56 


90 


Y28280 


Homo sapiens 


Human G-protein coupled receptor GRIR-2. 


326 


79 


91 


U9891 


Homo sapiens 


polycystic kidney disease-associated protein 


1751 


95 


92 


AF064876 


Homo sapiens 


ion channel BCNG-1 


953 


yy 


93 


AF170723 


Homo sapiens 


protein kinase STK10 


401 


jj 


94 


X13292 


Trypanosoma 
brucei 


GPI-phospholipase C (AA 1-358) 


151 


37 


95 


Y34127 


Homo sapiens 


Human potassium channel K+Hnovl 1. 


661 


99 


96 


X03638 


Rattus 
norvegicus 


sodium channel protein I (aa 1-2009) 


l / /J 


yi 


97 


AF134213 


Homo sapiens 


ubiquitin-specific protease 


1995 


QQ 

yy 


98 


G00838 


Homo sapiens 


Human secreted protein, SEQ ID NO: 4919 


211 


JO 


99 


AF021935 


Rattus 
norvegicus 


myotonic dystrophy kinase-related Cdc42-binding 
kinase 


675 


48 


100 


AF279265 


Homo sapiens 


putative anion transporter 1 


867 


98 


101 


AC007878 


Homo sapiens 


match to nuclear protein, NP220; note: sequence 
difference at residue 58 


160 


60 


102 


U22829 


Mus 

musculus 


P2Y purinoceptor 


264 


42 


103 


Y45023 


Homo sapiens 


Human sensory transduction G-protein coupled 
receptor-B3. 


516 


99 


104 


Y94990 


Homo sapiens 


Human secreted protein vb21 1, SEQ ID NO:20. 


787 


98 


105 


Y87342 


Homo sapiens 


Human signal peptide containing protein HSPP-119 
SEQ ID NO:119. 


343 


57 


106 


AF169312 


Homo sapiens 


hepatic angiopoietin-related protein 


212 


67 


107 


AF116657 


Homo sapiens 


PRO1310 


74 


52 


108 


AE000401 


Escherichia 
coli 


sialic acid transporter 


587 


96 


109 


Y38395 


Homo sapiens 


Human secreted protein encoded by gene No. 10. 


693 


IDo 


110 


Y78801 


Homo sapiens 


Hydrophobic domain containing protein clone 
HP0063 1 amino acid sequence. 


182 


94 


111 


Z25535 


Homo sapiens 


nuclear pore complex protein hnupl53 


464 


85 


112 


Y9493& 


Homo sapiens 


Human secreted protein clone ye90J protein 
sequence SEQ ID NO:84. 


274 


51 


113 


AF016365 


Homo sapiens 


hexokinase 1 isoform td 


301 


71 


114 


AC007956 


Homo sapiens 


unknown 


520 


75 


115 


M83738 


Homo sapiens 


protein-tyrosine phosphatase 


251 


92 


116 


AL157952 


Homo sapiens 


dJ875K 15.1.1 (ets homologous factor (ets- 
domain transcription factor ESE-3A, isoform 1)) 


484 


91 


117 


W18084 


Homo sapiens 


Human Aurora-2. 


546 


87 



118 
119 


L41816 
AJ006710 


Homo sapiens 
Rattus 
norvegicus 


cam kinase I 
phosphatidylinositol 3-kinase 


107 
527 


52 




120 


AF026954 


Bos taurus 


pyruvate dehydrogenase phosphatase regulatory 
subunit precursor, PDPr 


1646 


94 


121 

122 


S39392 
U60805 


Homo sapiens 
Homo sapiens 


protein tyrosine phosphatase, PTPase {EC 
3.1.3.48} 

oncostatin-M specific receptor beta subunit 


373 
262 


68 
88 


123 
124 


Y44403 
U88167 


Homo sapiens 
Caenorhabditis 
elegans 


Human truncated tankyrase-1. 
contains similarity to Ul domains 


111 


35 
29 


125 


AF300648 


Homo sapiens 


guanine nucleotide binding protein beta subunit 
4 


693 


90 


126 


AB021861 


Mus 

musculus 


apoptosis signal-regulating kinase 2 


153 


65 


127 
128 


AF305210 
M90360 


Homo sapiens 
Homo sapiens 


concentrative Na+-nucleoside cotransporter 
hCNT3 

nrrrff^in It in ft S£ 


807 

220 
574 


97 

73 
86 


129 
130 
131 

132 


D32202 

AF208043 

AF201734 

AF112886 


Homo sapiens 
Homo sapiens 
Mus 

musculus 
Bos taurus 


alpha 1C adrenergic receptor isoform 2 
IFI16b 

testis specific serine kinase-3 
differentiation enhancing factor 1 


496 
800 

-159 


67 
87 

74 


133 
134 

135 


AJ278314 
W74802 

AB020335 


Homo sapiens 
Homo sapiens 

Homo sapiens 


phospholipase C-beta-1b 

Human secreted protein encoded by gene 73 
clone HSQEL25. 
Pancreas-specific gene 


554 
1157 

"161 


85 
87 

~ 

"98 


136 
137 


W80408 
AC002563 


Homo sapiens 
Homo sapiens 


A secreted protein encoded by clone dt674_2. 
putative RHO/RAC effector protein; 95% 
similarity to P49205 (PID:gl345860) 
PR03434 a novel secreted protein. 


866 
891 


QQ 

100 


138 
139 

140 


Y96736 
AB024034 

W97809 


Homo sapiens 
Arabidopsis 
thaliana 
Homo sapiens


DNA-damage inducible protein DDI1-like
Human GTPase regulator GRAF. 


147 
248 


55 
56 


141
142 


Y51557 
AF090113 


Homo sapiens 
Rattus


Human PLA2 protein. 

AMPA receptor binding protein 

Human RECK cancer-inhibiting protein. 


125 
623 


46 
93

82


143
144


W26642 
U87306 


Homo sapiens 

Rattus 

norvegicus 


transmembrane receptor UNC5H2 


578 


84 


145 
146 


AF264014 
W63683 


Homo sapiens 
Homo sapiens


scavenger receptor cysteine-rich type 1 protein
M160 precursor

Human secreted protein 3. 


727 
140 


92 
40 


147
148 


M96264 
D64014 


Homo sapiens
Escherichia


galactose-1-phosphate uridyl transferase
HrsA 


513 
818 


81 
90 


149 


M83316 


Escherichia 
coli


pppGpp phosphohydrolase 


915 


95 


150 


" AU&279 


Homo sapiens 


homolog to cAMP response element binding and 
beta transducin family proteins 


1261 
940


99
99


151 
152 

153
154


AF179867
R95332 

AF151859 
X66957 


Homo sapiens 
Homo sapiens

Homo sapiens
Homo sapiens


STE20-like kinase
Tumor necrosis factor receptor 1 death domain
ligand (clone 3TW). 
CGM01 protein 
hexokinase type 1 


392 

370 
489 


61 

92 
81 

to 


155
156 
157 


Y16355 
G00857 
AF159455 


Homo sapiens 
Homo sapiens 
Mus 

musculus 


alternatively spliced form
Human secreted protein, SEQ ID NO: 4938.
zinc finger protein

interleukin-1 receptor-associated kinase


432 

14.0 

352 
537 


78 
74 

76 


158 
159


L76191
AP001743


Homo sapiens
Homo sapiens


putative gene, ankirin like, possible dual

specifity Ser/Thr/Tyr kinase domain


670 


98 


160 
161


AJ250425
G02885


iggj 


Collybistml 
Human secreted protein, SEQ ID NO: 6966.


556 
370 


74 
100 






SEQ 
ID 

NO: 


Accession 
No. 


Species 


Description 


Smith-

Waterman 

Score 


% 

Identity 


162 


Z22968 


Homo sapiens 


M130 antigen


610 


100 


163 


AF181121


Homo sapiens 


ATP-dependent Ca2+ pump PMR1 


336 


92 


164 


AF055636 


Homo sapiens 


leucine-rich glioma-inactivated protein precursor


455 


94 


165 


AF160798 


Rattus 
norvegicus 


calcium transporter CaT1


700 


OA 
7v 


166 


Y76332 


Homo sapiens 


Fragment of human secreted protein encoded by
gene 38. 


327 


*♦ j 


167 


Y48607 


Homo sapiens


Human breast tumour-associated protein 68.


1072 


nn 


168 


AB020741 


Mus 

musculus 


NIK-related kinase 


197 


*tj 




Al*252203 " 


Homo sapiens 


PAR3 




AA 


170 


U59429 


Cricetinae
gen. sp.


diacylglycerol kinase eta 


481 


82 


171 


AF035268 


Homo sapiens 


phosphatidylserine-specific phospholipase A1






172 


AF127085 


Mus 

musculus 


semaphorin cytoplasmic domain-associated

protein


io1 


n 


173 


Y27918 


Homo sapiens 


Human secreted protein encoded by gene No. 
123. 


653 


99 


174 


G02979 


Homo sapiens


Human secreted protein, SEQ ID NO: 7060.




07 

yt 


175 


U36488 


Mus 

musculus 




"Tar 

loo 


*K 

jj 


176 


W95629 


Homo sapiens 


Homo sapiens secreted protein gene clone

gm196_4.


1077 


inn 


177 


AF2S9023 


Homo sapiens 


formiminotransferase cyclodeaminase form D 


255 


93 


178 


X04936 


Homo sapiens 


T-cell receptor alpha-chain (411 is 2nd base in

codon)




QQ 


179 


AF127481 


Homo sapiens 


non-ocogenic Rho GTPase-specific GTP 
exchange factor 


175 


flu 


180 


G00978 


Homo sapiens 


Human secreted protein, SEQ ID NO: 5059. 




Q4 


181 


Y66645 


Homo sapiens


Membrane-bound protein PRO1310.






182 


AF110640


Homo sapiens 


orphan seven-transmembrane receptor




ion 

1UU 


183 


AB020854 


Bos taurus 


orphan transporter short splicing variant 


766


84 


184 


AF169691 


Homo sapiens


111 ll&C px VlCili Y i\0 


31 J 


79 
JO 


185 


AF126372 


Homo sapiens 


thyrotropin-releasing hormone degrading 


985 


99 


186 


L20966 


Homo sapiens 


phosphodiesterase 


541 


76 


187 


G02920 


Homo sapiens


Human secreted protein, SEQ ID NO: 7001.


HA " 


07 


188 


Y94918 


Homo sapiens


TTiirnnn wrrrfpH nrnfpm rlr»np rWSO/l 1 Q nmt«n 

iiuniau »cwclcu pruiciu wuue uujv4_iD proicin 
sequence SEQ ID NO;42. 


3\)l 


no 
70 


189 


Y66713 


Homo sapiens 


Mptnhrfinp-hmmH nrntpin PRHnfK) 




TOO 
IUU 


190 


G03244 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7325. 


331 


73 


191 


U36771 


Rattus 
norvegicus 


tfl-clvr prof 3-nhfKnhAtP nrvlfrnnrfprnep 


707 




192 


R05935 


Homo sapiens


SpprptpH frPlTh Cllhiinit nf multifile ciiknnt+ 

ocucicu vjr iiu suDujui \ji multiple iUDunu 
polypeptide (MSP)GPHWIIa. 


t ^7 


11 


193 


M92084 


Theileria
parva 


casein kinase II alpha subunit 


364 


50 


194 


Y66645 


Homo sapiens 


Membrane-bound protein PRO1310.




QO 
7U 


195 


W95631 


Homo sapiens 


Homo sapiens secreted protein gene clone

hj968_2.




47 


196 


AF255614 


Rattus 
norvegicus 


scaffolding protein SLIPR


680 


QQ 
77 


197 


AC021640 


Arabidopsis 
thaliana 


putative phosphatidate phosphohydrolase.


300 


41 


198 


AF073967 


Mus 

musculus 
domesticus 


olfactory receptor 


316 


43 


199 


W01730 


Homo sapiens 


Human G-protein receptor HPRAJ70. 


617 


98 


200 


AF117948


Homo sapiens 


pancreas-enriched phospholipase C 


625 


89 


201 


AF128625 


Homo sapiens 


CDC42-binding protein kinase beta 


636 


94 


202 


AF117946


Homo sapiens 


Link guanine nucleotide exchange factor II 


1303 


100 


203 


Y53021 


Homo sapiens 


Human secreted protein clone qc646_1 protein
sequence SEQ ID NO;48. 


701 


99 


204 


AF227968 


Homo sapiens 


SH2-B beta signaling protein 


182 


79 


205 


S81752 


Homo sapiens 


DPH2L=candidate tumor suppressor gene


375 


100 






ID
NO:

206


No.

U18315


Species
Sus scrofa


Description

{ovarian cancer critical region of deletion}
parathyroid receptor


Smith-
Waterman
Score

122
170


Identity

60
96


207 

208

209 
210 


AF255342 
Q<7051 S 

W63683 
D79992 


Homo sapiens 

Homo sapiens 
Homo sapiens 


putative pheromone receptor V1RL1 long form

neurotransmitter transporter

Human secreted protein 3.

similar to Drosophila photoreceptor cell-specific

protein, calphotin. 


15 
840 
541 

1348 


94 
99 
82 

99 


212 


AF117948

nr 11/ 7"o 

U81035 


Homo sapiens
Rattus
norvegicus


pancreas-enriched phospholipase C 
ankyrin binding cell adhesion molecule 
neurofascin 


471 

798


69 
56 


213


AF154846
AF102777

rVT IV* fit 


Homo sapiens 
Mus 

musculus


zinc finger protein

FYVE finger-containing phosphoinositide kinase

putative gene containing transmembrane domain 


933 

523 


93
89 


215 
216 

1 in 
217


AL163303 
U26595 

UU1v?J 


Homo sapiens 

Rattus 

norvegicus 


prostaglandin F2a receptor regulatory protein 
precursor 

Human secreted protein, SEQ ID NO: 8176. 


563 

644 
314 


78 

98 
81 


| 218 
219 
220 


X75756 
Y66723 
D88577 


Homo sapiens 
Homo sapiens 
Mus
musculus


protein kinase C mu
Membrane-bound protein PRO1100.
Kupffer cell receptor


770 

JO/ 

853 


98 
40 

100 


221 

222 


A T-H CO AtLK. 

AF25o4o5 
AF021935 


Homo sapiens

Rattus
norvegicus 


OTRPC4 

mytonic dystrophy kinase-related Cdc42-binding 
kinase 


636 


96 


223 


AL136527 


Homo sapiens


bA215B13.1 (A kinase (PRKA) anchor protein
10 


690 


100 

Xv\> 

99 


224 
225 


AB032417 
At(/jU43U 


Homo sapiens 
Mus

musculus


WNT receptor Frizzled-4
semaphorin VIa


703 


68 


226 


AE000218 


Escherichia 
coli 


putative dihydroxyacetone kinase (EC 2.7.1.2)


297 
2080 


39 
100 


227 
228 


AF302150 
AB024573 


Homo sapiens 
Mus 

musculus 


phosphoinositol 3-phosphate-binding protein-2
GTP-binding like protein 2 


265


88


229
230


AF122924 
G03205 


Xenopus 
laevis 

Homo sapiens 


Wnt inhibitory factor-1

Human secreted protein, SEQ ID NO: 7286.


316 

229 
265 


40 

100 
92 


231 
232 
233

234 


X98260 
R92754 
R75111

W69431
Y08686 


Homo sapiens 
Homo sapiens 
Homo sapiens 

Homo sapiens 
Homo sapiens 


M-phase phosphoprotein 11
Human growth differentiation factor-12.
Glycosyl-phosphatidylinositol-specific

phospholipase-D.

Human secreted protein cw1233_3.

serine palmitoyltransferase, subunit II


682 
290 

235 

159


95
100

97
81


235
236
237 


AF118275
X81466 


Homo sapiens 
Mus 

musculus


atrophin-related protein ARP
Embryo Brain Kinase 


460 


37 
62 


238 


U64857 


Caenorhabditis
elegans


similar to the BPTI/Kunitz family of inhibitors;
most similar to tissue factor pathway inhibitor
precursor (TFPI) 


284 


33 


239 


AJ250840 


Mus 

musculus 


serine/threonine protein kinase


739 


63 


240 


AJ223472 


Mus 

musculus 


transcription elongation factor TFIIS.h 


222 


38 


241 


Y94906 


Homo sapiens 


Human secreted protein clone rb649_3 protein 
sequence SEQ ID NO;18. 
Na+/sulfate cotransporter SUT-1


353
591


52

99


242
243 


L22022 


Homo sapiens

Rattus 

norvegicus 


orphan transporter v7-3 


667 


93 


244 

245 
246 
247 
248 
249 


AF01^191 

AF097366 

Y29868 

AF180475 

Y17227 

AF250910 


Rattus 
norvegicus 
Homo sapiens 
Homo sapiens
Homo sapiens 
Homo sapiens 
Manduca 


potassium channel 

cone sodium-calcium potassium exchanger
Human secreted protein clone pp325_9.

Not4-Np

Human secreted protein (clone yal-1).

death-associated small cytoplasmic leucine-rich


1043 

645 
497 
188 
690 
182 


98 

98 
98 
83 
99 
31 






SEQ 
ID 

NO: 


Accession 
No. 


Species 
sexta 


Description 
protein SCLP 


Smith- 
Waterman 
Score 


% 

Identity 


250 
251 


AF192756 
AB022694


Kaposi's 
sarcoma- 
associated 
herpesvirus 
Homo sapiens 


Orf73 

MOK protein kinase 


134 
l\)y 


34 
83 


252 
253 

254


W55W5 
L46815

WoojUD 


Homo sapiens
Mus 

musculus 
Homo sapiens 


Neural adhesion molecule {ethbOOISC product). 
DNA binding protein Rc 

Human acid sensing ionic channel.


469 

251 


100 

£1 

\*i 


255 
256 


AF070066 
G02491 


Mus 

musculus 
Homo sapiens 


Citron-K kinase 
Human secreted protein, SEQ ID NO: 6572.


173 
1201 


82 
98 


257 
258 


Y95436 


Oryctolagus 
cuniculus 
Homo sapiens 


Phospholipase


460 
368 


100 
80


260


AJzzzyoo 
AJ250839 


Mus 

musculus 
Homo sapiens 


Human calcium channel SOC-3/CRAC-2. 
L-periaxin 

serine/threonine protein kinase


1857 
430 

861 


99 
72 

100 


261 
262 

263 


AJ249977 
AF141386 

AF022859 


Homo sapiens 
Rattus 
norvegicus 
Homo sapiens 


AMP-activated protein kinase gamma 3 subunit
SUT-2 

neuropilin-2(a0) 


758 
198 

335 


98 
40 

& 


264 
265 

"ICC 


AF160477
Y44662 


Homo sapiens 
Homo sapiens 


Ig superfamily receptor LNIR precursor
Human 14273 G-protein coupled receptor 
(GPCR). 


387 
636 


91 

99 


266

267


U27269
AF124491


Mus 

musculus 
Homo sapiens 


sodium glucose cotransporter 

ARF GTPase-activating protein GIT2


204 


56 


268 
269 


AF127389
X98296 


Rattus 
norvegicus 
Homo sapiens


putative taste receptor TR1 
ubiquitin hydrolase 


159 
209 


75 
39 


270 


X78482 


Streptococcus 
pyogenes 


Fc-gamma receptor 


215 
129 


95 
26 


271 


AB009883 


Nicotiana 
tabacum 


KED 


109 


26 


272 


AF137367 


Mus 

musculus 


VPS10 domain receptor protein SORCS


899 


97 


273 


L34938 


Rattus 
norvegicus 


ionotropic glutamate receptor 


460 


86 


274 


AL022724 


Homo sapiens 


dJ413H6.1.1 (hamster Androgen-dependent 
Expressed Protein LIKE PITTATrvp r»rr»N»i«\ 
(isoform 1) 


188 


74 


275 

276 
277


AF265555
L40380 


Homo sapiens 

Homo sapiens 
Homo sapiens 


ubiquitin-conjugating BIR-domain enzyme
APOLLON 

Human secreted protein, SEQ ID NO: 6953. 
thyroid receptor interactor 


173 
148 


94 
56 


278 
279 

280 
281 


AC008075 

M83738 
AK024397 


Homo sapiens 
Arabidopsis 
thaliana 
Homo sapiens 
Homo sapiens 


KIAA1631 protein

Contains PF|00069 Eukaryotic protein kinase
domain. 

protein-tyrosine phosphatase 


430 
283 
157 

181 


61 
96 
43 

73


282
283


AF141326 
AF156530 


Homo sapiens 
Mus 

musculus 


unnamed protein product 

RNA helicase HDB/DICE1

ets-domain transcriptional repressor PE1


439 
497 
605 


91 
84 
76 


284 


Y29336 


Homo sapiens 


Human secreted protein clone cs756_2 alternate
reading frame protein. 


04/ 


100 


285 

286 
287 
288 
289 
290 
291 


Y73402 

AF016411 
W89253 

AF112886
AF113131
U52111 

AFU26504 3 


Homo sapiens 

Homo sapiens 
Homo sapiens
Bos taurus
Homo sapiens
Homo sapiens
Rattus 


Human secreted protein clone yc25_1 protein
sequence SEQ ID NO:26.
KCNA3.1B
Human ALP.

differentiation enhancing factor 1

host cell factor homolog LCP

plexin-related protein

SPA-1 like protein p1294


300 

137 
>88 
750 
367 
598 
503 


90 

100 

97 

% 

U 

100 

sp 






SEQ 
ID 

NO: 



Accession 
No. 



Species 



Description 



Smith- 
Waterman 
Score 



% 

Identity 



292 



293 



294 



295 



296 
297 



298



AF102854 



norvegicus 



X99211 



Rattus 

norvegicus 

Drosophila 



membrane-associated guanylate kinase- 
interacting protein 2 Maguin-2



124 



ubiquitin-specific protease 



143 



Y94943 



Homo sapiens 



Human secreted protein clone yt14_1 protein
sequence SEQ ID NO:92. 



185 



Y94890 



Homo sapiens Human protein clone HP02798. 



AF019767 



Homo sapiens 



Y28568 



Homo sapiens 



Y94943 



Homo sapiens



108 



zinc finger protein 



Secreted peptide clone bd577_1.



Human secreted protein clone yt14_1 protein
sequence SEQ ID NO:92.



182 



53 



38 



94



39
96 



97 



299 



300 
301 



B08906 



Homo sapiens 



Human secreted protein sequence encoded by
gene 16 SEQ ID NO:63.



R58890 



Homo sapiens



Human-32 cadherin-related molecule.



212 
277 



97 
100 



AF022859 



Y71124 



Homo sapiens



neuropilin-2(a0) 



Homo sapiens 



Human mitogenic regulator duox2. 



716 



97 



303 



304 
305 



Y44297 



D32050 



Homo sapiens 



Human receptor tyrosine kinase. 



Homo sapiens



alanyl-tRNA synthetase 



192 



80

72 



U43586 



Homo sapiens 



protein kinase related to Raf protein kinases; 
Method: conceptual translation supplied by 
author 



R54572



Homo sapiens 



Human H13 viral receptor mutant 4. 



280 



307 



D78572 



Mus 

musculus 



membrane glycoprotein 



308 



309 



310 
311 



AF255614 



Rattus 
norvegicus 



scaffolding protein SLIPR



S79463 



Mus sp. 



AF178941 



Homo sapiens



semaphorin homolog=M-Sema F



162 



ATP-binding cassette sub-family A member 2 



736 



89 
100 



312 



313 



314 



U03413 



Dictyostelium 
discoideum 



calcium binding protein 



Y87347



Homo sapiens 



Human signal peptide containing protein
HSPP-124 SEQ ID NO:124.



744 



Z97055 



Homo sapiens 



dJ388M5.4 (putative GS2 like protein) 



789 



AC004010 



Homo sapiens 



similar to Leucine-rich transmembrane proteins; 
44% similarity to U42767 (PID:g1736918)



wr 



100 
99



38 



315 



316 



AL021392 



Homo sapiens 



dJ439F8.2 (supported by GENSCAN and 
GENEWISE)



U70209 



Mus 

musculus 



polycystic kidney disease 1 protein 
coxsackie-adenovirus-receptor homolog 



165 



38 



317 



318 



319 



320 
321 



AF109643 



AF104923 



Rattus 
norvegicus 



Homo sapiens 



putative transcription 



138 



AF100287 



Trypanosoma 



activated protein kinase C receptor homolog 



T4T 



G00588 



Homo sapiens 



Human secreted protein, SEQ ID NO: 4669. 



125 
459 



84 



38



51 
97 



322 



323 



324 



325 



326 



327 
328 



Y21591 



Homo sapiens 



Human secreted protein (clone CC332-33). 



D26070 



Homo sapiens 



Y27918 



Homo sapiens 



AF010144 



Homo sapiens



M19650 



Homo sapiens 



W80396 



Homo sapiens 



human type 1 inositol 1,4,5-trisphosphate 
receptor 



232



Human secreted protein encoded by gene No. 
123. 



306 



neuronal thread protein AD7C-NTP 



209 



2',3'-cyclic-nucleotide 3'-phosphodiesterase (EC



214 



3.1.4.37)



X75756 



Homo sapiens 



A secreted protein encoded by clone bp646_10. 
protein kinase C mu 



140 



540 



97 



70 



97 



70 



78 



329 



330 



G02292 



Homo sapiens 



Human secreted protein, SEQ ID NO: 6373. 



AF168990 



Homo sapiens 



putative GTP-binding protein 



877 



S67984 



Homo sapiens 



anti-HIV gpl20 antibody heavy chain variable 
region 



581 



99 



80 



331 



332 



X13916 



Homo sapiens 



LDL-receptor related precursor (AA -19 to 4525) 



Y87330 



Homo sapiens 



Human signal peptide containing protein
HSPP-107 SEQ ID NO:107.



2823 
1127 



100 
98







SEQ 
ID 

NO: 


Accession 
No.


Species 


Description 


Smith-

Waterman

Score 


70 








similarity to P49205 (PID:g1345860)






335 


Y87347 


Homo sapiens 


Human signal peptide containing protein
HSPP-124 SEQ ID NO:124.


1111 

A ill 


67 


336 


AF006466 


Mus 

musculus 


lymphocyte specific formin related protein


193


7i 

tj 


337 


AF265555 


Homo sapiens 


ubiquitin-conjugating BIR-domain enzyme 
APOLLON 


632 


97 


338 


Y13443 


Homo sapiens 


Amino acid sequence of hSlo3-2.


515 


1 nf\ 
JVW 


339 


Y07637


Homo sapiens 


putative GABA-gated chloride channel 


189 


100 


340 


Y05734 


Homo sapiens 


Human Grb7 effector 22412 protein. 


a jo 


00 

yy 


341 


AE000497 


Escherichia 
coli 


L-idonate transcriptional regulator 


928 


98 


342 


D90855 


Escherichia 
coli 


glycerol-3-phosphate dehydrogenase (EC 
1.1.99.5) chain A, anaerobic


769 


99 


343 


D85613 


Escherichia 
coli 


membrane component 


399 


100 


344 


M93239 


Escherichia 
coli 


transmembrane protein 


232 


100 


345 


M60177 


Escherichia 
coli 


enterobactin 


759 


99 


346 


D90699 


Escherichia 
coli 


Sensor protein copS (EC 2.7.3.-). 


638 


97 


347 


D90843 


Escherichia 
coli 


CapB protein. 


552 


100 


348 


M13422


Escherichia

coli 


A.Q lrH nmtmn 
f7 Ml pi ULCill 


1193 


96 


349 


L10328


Escherichia

coli 


similar to drug resistance translocases 


340 


90 


350 


X69942 


Mus 

musculus 


enhancer-trap-locus-1 


560 


82 


351 


AF239613 


Homo sapiens 


apamin-sensitive small-conductance Ca2+-activated
potassium channel


4f>3 


80 


352 


D90777 


Escherichia 
coli 


3-hydroxybutyryl-CoA dehydrogenase (EC 

1.1.1.157) (Beta-hydroxybutyryl-CoA

dehydrogenase) (BhbD). 


577 


100 


353 


D90863 


Escherichia 
coli 


similar to 


^1 1 

311 


no 

yo 


354 


Y52386 


Homo sapiens 


Human transmembrane protein HP02000. 


133 


58 


355 


Y31645 


Homo sapiens 


Human transport-associated protein-7 (TRANP- 
7). 


482 


J j 


356 


Y58637 


Homo sapiens 


Protein regulating gene expression PRGE-30. 


119 


5I 


357 




Homo sapiens 


dual-specificity tyrosine phosphatase YVH1 


1788 


100 


358 


Y87219 


Homo sapiens 


Human secreted protein sequence SEQ ID 
NO:258. 


165 


i oo 


359 


J00132 


Homo sapiens 


beta-fibrinogen 


233 


93 


360 


G03789 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7870. 


128 


70 


361 


R28916 


Homo sapiens 


Type III procollagen (prior art). 


108 


40 


362 


U16655 


Rattus 
norvegicus 


phospholipase C delta-4 


649 


65 


363 


G03119 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7200. 


95 


42 


364 


U47276 


Gallus gallus 


chicken brain factor-2 


104 


34 


365 


G03789 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7870. 


183 


65 


366 


G04091 


Homo sapiens 


Human secreted protein, SEQ ID NO: 8172.


118 


46 


367 


X98258 


Homo sapiens 


M-phase phosphoprotein 9 


564 


75 


368 


AL021366 


Homo sapiens 


CICK0721Q.3 (Kinesin related protein) 


3387 


99 


369 


U70932 


Peromyscus 
leucopus 


reverse transcriptase 


92 


5.9 


370 


X86400 


Homo sapiens 


gamma subunit of sodium potassium ATPase 
like 


242 


73 


371 


G03172 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7253.


165 


56 


372 


U49974 


Homo sapiens 


mariner transposase 1 


257 


55 


373 


X13916 


Homo sapiens 


LDL-receptor related precursor (AA -19 to 4525) 


21193 


99 


374 


AF234765 


Rattus 
norvegicus 


serine-arginine-rich splicing regulatory protein 
SRRP86 


U82 


78 




U49974 


Homo sapiens 


mariner transposase 


172 


55 






SEQ
ID
NO:
376


Accession
G01984


Species
Homo sapiens


Description
Human secreted protein, SEQ ID NO: 6065.


Smith-
Waterman
Score
21


Identity
SI


377 
378 

379 


G00669 
X52574 

R69095 


Homo sapiens
Mus 

musculus 
Homo sapiens 


Human secreted protein, SEQ ID NO: 4750.
GTP binding protein

Anti-HIV Fab tat31 light chain.
alpha-2 type XI collagen


00
456

58 
125 


100 
91 

37 
37 


380 
381 
382 

383 
384 
385 
386 
387 
388 


J04974 

AB002405

U64830 

G02916 
G01194 
AJ245822
D86974
G03203
G04072 


Homo sapiens
Homo sapiens 
Dictyostelium 
discoideum 
Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 


LAK-4p 

protein tyrosine kinase 

Human secreted protein, SEQ ID NO: 6997.
Human secreted protein, SEQ ID NO: 5275. 

type I transmembrane receptor 

KIAA0220 

Human secreted protein, SEQ ID NO: 7284.
Human secreted protein, SEQ ID NO: 8153. 


530 
115 

618 

617 

4560 

2148 

142 

99


43 
44 

98 

93 

100 

98 

SO 

-so 


389 
390 
391 
392 
393 
394 
395 


M12140 
AJ293309
Y42751
W48351
Y14442
W85607
Y76332


Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 


envelope protein 
NHP2 protein 

Human calcium binding protein 2 (CaBP-2).
Human breast cancer related protein BCRB2. 
olfactory receptor protein 

Secreted protein clone da228 J>. 

Fragment of human secreted protein encoded by 
gene 38. 

Human secreted protein, SEQ ID NO: 8011.


197 
461 
181 
241 
339 
957 
171 

250 


51 
77 

OA 

66 
54 

1 An 
1UU 

34 
100 


396 
397 


G03930 
AB032904 


Homo sapiens 

Hylobates 

syndactylus 


Human secreted protein, SEQ ID NO: 8011.
dopamine receptor D4 


105 
861 


35 
85


398 
399 


AJ007798
Y91405 


Homo sapiens 
Homo sapiens 


stromal antigen 3, (STAG3).

Human secreted protein sequence encoded by 

gene 2 SEQ ID NO: 126. 

Human secreted protein clone cb98_4.


1047
162 


92 
37 


400 
401 

402 


Y29861 
D87002 

AF100754


Homo sapiens 
Homo sapiens 

Homo sapiens 


similar to rat integral membrane glycoprotein, 

accession number Z21513. 

ancient ubiquitous protein AUP1 isoform 


527 
853 


78 
95 


403 
404 


X74904 
AF075462 


Gallus gallus
Mus 

musculus 


alpha-2-macroglobulin receptor

ADP-ribosylation factor-directed GTPase

activating protein isoform b


258 
545 


60 
89 


405 
406 


X92887 
Y30162 


Human 
endogenous
retrovirus K
Homo sapiens


pol/env 

Human dorsal root receptor 4 hDRR.4. 


T62 

325 
2833 


30 

72 
99 


407 
408 
409 


AK022626 

L13802 

Y91600 


Homo sapiens
Homo sapiens
Homo sapiens


unnamed protein product

ribosomal protein small subunit

Human secreted protein sequence encoded by 

gene9SEQIDNO-^73. 


264 
1788 


92 
89 


410 


W88745 


Homo sapiens


Secreted protein encoded by gene 30 clone

HTSRV09. 


2004 


99 


411 


AB043953 


Mus
musculus 


Chat-H


2628


82


412 


Y86233 


Homo sapiens


Human secreted protein HNTMX29, SEQ ID
NO:148. 


1014 


92 


413 

414 
415 


U10542 

AF155097
G03203 


Pan 

troglodytes 
Homo sapiens
Homo sapiens


MHC class I A

NY-REN-7 antigen
Human secreted protein, SEQ ID NO: 7284.
Human transmembrane protein HTMPN-35.


263 

850 

88 

246 


/ 1 

95 
48 
&9 


416 
417 
418 
419 


Y57911 
W27651 
Y76884 
AF255559 


Homo sapiens 
Homo sapiens 
Homo sapiens 
Notothenia 
coriiceps 


Secreted protein AT205.

Retinoblastoma binding protein-7 sequence.
alpha tubulin

Human secreted protein, SEQ ID NO: 6065.


481 

3077 
289

209


60
87
68

74 


420 
421 


G01984 
AL109827 


Homo sapiens 
Homo sapiens


dJ309K202 (acrosomal protein ACR55 (similar
to rat sperm antigen 4 (SPAG4)))


1446


96


422 


AC008075 


Arabidopsis 
thaliana


F24J5.4 


112 


35 






SEQ

ID 

NO: 


Accession 
No. 


Species 


Description 


Smith- 
Waterman 
Score 


Identity 


423 


AF231705 


Homo sapiens 


Alu co-repressor 1 


1090 


100 


424 


AF234887 


Homo sapiens 


FLAMINGO 1 


6268 


97 


425 


Y35942 


Homo sapiens 


Extended human secreted protein sequence, SEQ 
ID NO. 191. 


1961 


99 


426 


AB009288 


Homo sapiens 


N-copine 


635 


98 


427 


L12392 


Homo sapiens 


Huntington's Disease protein 


16080 


99 


428 


Y94990 


Homo sapiens 


Human secreted protein Yb21_1, SEQ ID NO:20.


768 


98 


429 


AJ293573 


Homo sapiens


zinc finger protein Cezanne


542 




430 


Y84441 


Homo sapiens 


Amino acid sequence of a human RNA- 
associated protein. 


2074 


100 


431 


G02850 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6931. 


723 


95 


432 


G04067 


Homo sapiens 


Human secreted protein, SEQ ID NO: 8148.


73 


42 


433 


AF159296 


Lycopersicon 
esculentum 


extensin-like protein 


613 


48 


434 


W48351 


Homo sapiens 


Human breast cancer related protein BCRB2. 


135 


44 


435 


X73874 


Homo sapiens 


phosphorylase kinase 


3442 


97 


436 


AF161426 


Homo sapiens 


HSPC308 


268 


74 


437 


Y30812 


Homo sapiens 


Human secreted protein encoded from gene 2. 


1055


52 


438 


G03798 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7879. 


168 


56 


439 


X14766 


Homo sapiens 


GABA-A receptor alpha 1 subunit 


2294 


96 


440 


X02344 


Homo sapiens 


beta-tubulin


311 




441 


AF168418


Homo sapiens 


activating signal cointegrator 1 


1882 


J IAS 


442 


L11672


Homo sapiens 


zinc finger protein 


795 




443 


G03203 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7284. 


93 


410 


444 


A52140 


unidentified 


HUMAN NDR 


2451 


100 


445 


X98330 


Homo sapiens 


ryanodine receptor 2 


9355 


yy 


446 


AF116712


Homo sapiens


PRO2738


227 


49 


447 


AF245447 


Homo sapiens 


sphingosine kinase type 2 isoform 


J fO 


□a 
yy 


448 


AF133086 


Homo sapiens 


membrane-type serine protease 1 


2630 


94 


449 


U87305 


Rattus 
norvegicus 


transmembrane receptor UNC5H1


01 / 




450 


AF081249 


Homo sapiens 


JAW1-related protein MRVI1A long isoform


4568 


99 


451 


AC005498 


Homo sapiens 


R31665_1


JlO 


oz 


452 


M60235 


Homo sapiens 


granule membrane protein-140


not 


15 


453 


AB036706 


Homo sapiens 


intelectin 


730 


88 


454 


G00918 


Homo sapiens 


Human secreted protein, SEQ ID NO: 4999.


zoo 


81 


455 


Y22634 


Homo sapiens 


Human cytokine inducible regulatory protein-1 
(CIRP-1). 


192 


67 


456 


Y36705 


Homo sapiens 


Fragment of human secreted protein encoded by 
gene 62. 


106 


40 


457 


N91325 


Homo sapiens 


DNA encoding human growth hormone receptor. 


3282 


96 


458 


M19155 


Plasmodium 
falciparum 


S-antigen precursor


t in 


JO 


459 


Y13377 


Homo sapiens 


Amino acid sequence of protein PRO257.


509 


98 


460 


Y02693 


Homo sapiens 


Human secreted protein encoded by gene 44
clone HTDAD22. 






461 


Y14482 


Homo sapiens 


Fragment of human secreted protein encoded by 
gene 17. 


184 


54 


462 


Y53005 


Homo sapiens 


Human secreted protein clone cm749_8 protein

sequence SEQ ID NO: 16.


135 


Al ' 

**i 


463 


X84960 


Triticum 
aestivum 


low molecular weight glutenin 


109 


33 


464 


W19919 


Homo sapiens 


Human Ksr-1 (kinase suppressor of Ras) 


1781 


OJ 


465 


AF189764 


Mus 

musculus 


alpha/beta hydrolase-1 


502 


59


466


U93569 


Homo sapiens 


p40 


101 


30 


467 


Y41528 


Homo sapiens 


Fragment of human secreted protein encoded by 
gene 77. 


1172 


99 


468 


G02872 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6953. 


149 


52 


469 


AJ000008 


Homo sapiens 


PI3-kinase


5832 


97 


470 


X70922 


Mus 

musculus 


neurotoxin homologue 


118 


47 


471 


G03797 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7878.


198 


75 


472 


Y36705 


Homo sapiens 


Fragment of human secreted protein encoded by 


72 


57 






SEQ 
ID 

NO: 


Accession 
No. 


Species 


Description 
gene 62. 


Smith- 

Waterman 

Score 


% 

Identity 


473 
474 


G02313 
Y07007 


Homo sapiens 
Homo sapiens 


Human secreted protein, SEQ ID NO: 6394.

Breast cancer associated antigen precursor 
sequence. 


328 
1013 


100 
97 


475 


W93254 


Homo sapiens 


Human ESRP1 protein. 

Human breast cancer related protein BCRB2. 


943 
236 


80 
65 


476 
477 


W48351 
Y02693 


Homo sapiens 
Homo sapiens 


Human secreted protein encoded by gene 44 
clone HTDAD22. 


202 


60 


478 


G01870 


Homo sapiens 


Human secreted protein, SEQ ID NO: 5951.


267 


100 


479 


AF102777 


Mus 

musculus 


FYVE finger-containing phosphoinositide kinase 


3427 


92 


480 


G03052 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7133.


123 


53 


481 


W87701 


Homo sapiens 


A human membrane fusion protein designated
SYTAX1. 


221 
131 


77 
39 


482 
483 


G03119 
AF210651


Homo sapiens 
Homo sapiens 


Human secreted protein, SEQ ID NO: 7200.

NAG18


124 


59 


484 


AF010144 


Homo sapiens 


neuronal thread protein AD7C-NTP


343 


50 


485 


G00637 


Homo sapiens 


Human secreted protein, SEQ ID NO: 4718.


129 


70 


486 


U15174 


Homo sapiens 


BCL2/adenovirus E1B 19kD-interacting protein
3 


IdO 


73 


487 


Y76167 


Homo sapiens 


Human secreted protein encoded by gene 44. 




100 


488 


AJ275213 


Homo sapiens 


stabilin-1 




91 


489 


G03798 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7879.


J 1 J 


65 


490 


L12392 


Homo sapiens 


Huntington's Disease protein 


16081 


100 




G03789 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7870. 


197 





492 


J03799 


Homo sapiens 


laminin-binding protein 


228 


70 


493 


U15174 


Homo sapiens 


BCL2/adenovirus E1B 19kD-interacting protein
3 


12S 


41 


494 


Y02693 


Homo sapiens 


Human secreted protein encoded by gene 44
clone HTDAD22.


137


61


495 


AC005175 


Homo sapiens 


R31449_3


889 


94 


496 


G03786 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7867.


229 


61 


497 


AB030237 


Canis
familiaris 


D4 dopamine receptor 


90 
228 


48 
_ 


498 
499 


G02872 
U70935 


Homo sapiens 

Peromyscus 

maniculatus 


Human secreted protein, SEQ ID NO: 6953.
reverse transcriptase 


213 


52 


500 


U48508 


Homo sapiens 


skeletal muscle ryanodine receptor 


26406 


99 


501 


G03371 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7452.


105 


58 


502 


AF119851 


Homo sapiens 


PRO1722


156 


62 


503 


AF113685


Homo sapiens 


PRO0974 


116 


50 


504 


U79458 


Homo sapiens 


WW domain binding protein-2 


XII 


59 


505 


W29651 


Homo sapiens 


Human secreted protein Ui>l24_J. 


/>0R 


55 


506 


W85459 


Homo sapiens 


Secreted protein encoded by clone dhi i35_y. 


986 


70 


507 


Y86265 


Homo sapiens 


Human secreted protein HUSAJtt'/v, sni^ xu 
NO:180. 


115


33 


508 


AL160175 


Homo sapiens 


bA243J16.3 (similar to MYLK (myosin, light
polypeptide kinase)) 


IRA 
1 o*t 


92 


509 


U43360 


Peromyscus 
maniculatus 


reverse transcriptase 


7 f . 


62 


510 


G03789 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7870.


117 


63 


511 


W79092 


Homo sapiens 


Human secreted protein dn740 3. 


1058 


100 


512 


AF010144 


Homo sapiens 


neuronal thread protein AD7c-NTP


205 


64 


513 


AJ133439 


Homo sapiens 


GRIP1 protein 


2151 


100 






Drosophila

melanogaster 


CG6393 gene product 


259 


42


515 


Z17206 


Xenopus 
laevis 


p46XlEg22


128 


40 


516 


AF104413 


Homo sapiens 


large tumor suppressor 1 


1766 


94 


517 


G03797 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7878.


92 


40 


518 


AF151083 


Homo sapiens 


HSPC249 


444 


98 


519 


S80864 


Homo sapiens 


cytochrome c-like polypeptide 


318 


50 


520 


X92485 


Plasmodium 
vivax 


pval 


370 


61 





521 


G03790 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7871.


1 <o 


59 


522 


AF121857 


Homo sapiens 


sorting nexin 7




40 


523


G02654


Homo sapiens 


Human secreted protein, SEQ ID NO: 6735.


oZ 


iff 

37 


524 


W88627 


Homo sapiens 


Secreted protein encoded by gene 94 clone
HPMBQ32.


in 

253 


73 


525 


AF119851 


Homo sapiens 


PRO1722




57 


526 


Y27761 


Homo sapiens 


Human cw.rMpxi nmtpin fTtwHpH hv tt&n* \Tr» An 
Anuuau jvvj&isaj pjuiwiu viivuucu oy gene jno. n • 


1 SA 

J 34 


57 


527 


G02707 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6788.


70 


43 


528 


U47924 


Homo sapiens 


C8 


i i to 


--57 ■ 

50 


529


G04&3 


Homo sapiens 


Human «fif*retfiH nrntwin SPA W\ XIA. ill Ail 


QA 


45 


530 


G03203 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7284. 


Hi 


60 


531 


G04067 


Homo sapiens 


Human secreted protein, SEQ ID NO: 8148.


92 


65 


532 


G03267 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7348. 


75 


29 


533 


G03203 




Human secreted protein, SEQ ID NO: 7284.


182 


48 


534 


AF068286 


Homo sapiens


mj\j]yiu j Or 


861 


100 


535 


U07707 


Homo sapiens 


epidermal growth factor receptor substrate 


228 


60 


536 


G01955


Homo sapiens 


Human secreted protein, SEQ ID NO: 6036.


484 


75 


537 


AF219232 


Gallus gallus


qin-induced kinase 


206 


53 


538 


afhvp? 

r\T 1 J JUx«£ 


Homo sapiens 


mediator 


128 


100 


539 


G03267 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7348. 


141 


5$ 


540 




Caenorhabditis
elegans


contains similarity to a BR-C/TTK domain


853 


39 


541 


AC003093 


Homo sapiens 


OXYSTEROL-BINDING PROTEIN; 45%
similarity to P22059 (PID:g129308)


408 


66 


542 


M29487 


Homo sapiens 


integrin alpha subunit precursor 


517 


81 


543 


AT IUZ0.JU 


Mus

musculus 


olfactory receptor F3 


327 


73 


544 




Homo sapiens 


Human secreted protein clone yb!86 1 protein 
sequence SEQ ID NO:84. 


386 


100 


545 


ADulrrojJ 


Pseudomonas 
aeruginosa 


probable TonB-dependent receptor 


279 


42 


546 


G03793 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7874. 


264 


53 


547 


Y69192 


Homo sapiens 



A human monocyte-macrophage apolipoprotein 
B receptor protein. 


1772 


67 


548


Y91493 


Homo sapiens 


Human secreted protein sequence encoded by 
gene *m oJiy li) NU.loo. 


176 


100 


549 


G01571


Homo sapiens


Human secreted protein, SEQ ID NO: 5652 


777 


99 


550 


AF044588 


Homo sapiens 


protein regulating cytokinesis 1; PRC1 


1953 


88 


551 


Y29332 


Homo sapiens 


Human secreted protein clone pc584_2 protein 
sequence. 


1224 


94 


552 


X98330 


Homo sapiens 


ryanodine receptor 2 


24621 


99 


553 


Y42782 


Homo sapiens


Human uc Band #331 protein. 


684 


95 


554 


AB025258 


Mus 

musculus


granuphilin-a 


501 


41 


555 


AJ010346 






1468 


100 


556 


W92388 


Homo sapiens 


Human TR-interacting protein S239a. 


538 


92 


557 


AF119851


Homo sapiens




175 


59 


558 


AF117756


Homo sapiens 


thyroid hormone receptor-associated protein 
complex component TRAP150


183 


32 


559 


G02872 


Homo sapiens


Human secreted protein, SEQ ID NO: 6953. 


319 


68 


560




Mus 

musculus


Ca2+ dependent activator protein for secretion 


1010 


93 


561 


AF187325


Canis 
familiaris 


melanoma antigen 


287 


55 


562 




Homo sapiens 


OXA1L 


2512 


99 


563 


Z17238 


Rattus 
norvegicus 


glutamate receptor subtype delta-1


338 


66 


564 


W30638 


Homo sapiens 


Partial human 7-transmembrane receptor 
HAP0167 protein. 


371 


100 


565 


AC005620 


Homo sapiens 


R33590 1 


467 


97 


566 


Y99358 


Homo sapiens 


Human PRO1772 (UNQ834) amino acid
sequence SEQ ID NO:63.


1138 


78 


567 


AL031177 


Homo sapiens 


dJ889M15.3 (novel protein)


1002 


58


568


AF151043 


Homo sapiens 


HSPC209 


798 


100 



569


AF097518


Homo sapiens


liver-specific transporter


231


100


570 
571 

572 
573 


AB035698 
Y07096 

AL031177 
Y66639 


Homo sapiens 
Homo sapiens 

Homo sapiens 
Homo sapiens 


Misshapen/NIK-related kinase MINK-1 
Colon cancer associated antigen precursor 
sequence. 

dJ889M15.3 (novel protein)

Membrane-bound protein PRO290. 


1532 
1064 

735 
254 


100 
100 

55
45
99


574 
575 
576 
577 
578 


AB037108 

D43949 

Y48596 

G00352 

R95913 


Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens


seven transmembrane domain orphan receptor 
This gene is novel 

Human breast tumour-associated protein 57. 

Human secreted protein, SEQ ID NO: 4433.

Neural thread protein. 


1883 
836 

108 

141 
140 


100

50 

75 
65 


579 
580 

581 


AK025116 
Y86473 

AF196779 


Homo sapiens 
Homo sapiens 

Homo sapiens 


unnamed protein product 

Human gene 52-encoded protein fragment, SEQ
ID NO:388.

JM10 protein 


201 
77 

450 


70 

70
100 


582 
583 


AF188706
AB030234 


Homo sapiens 

Canis
familiaris


g20 protein 

D4 dopamine receptor 

Human secreted protein, SEQ ID NO: 6702.


330 
64 

345 


98
56

90


584 
585 

586 


AL096828 
Y30819 


Homo sapiens 
Homo sapiens 


dJ963E22.1 (Novel protein similar to NY-REN-2 

Human secreted protein encoded from gene 9. 
Human secreted protein, SEQ ID NO: 4438.


268

235
132


85

35 
56 


587 
588 
589 


G00357 
G02872 
AF235017 


Homo sapiens 
Homo sapiens 
Mus 

musculus


Human secreted protein, SEQ ID NO: 6953.
2P1 protein 


182 
764 


79
80 


590 


W88627 


Homo sapiens 


Secreted protein encoded by gene 94 clone 
HPMBQ32. 


329 


81 


591 


Y30709 


Homo sapiens 


Amino acid sequence of a human secreted 
protein. 


110 


43


592 


Y53875 


Homo sapiens 


A human seven transmembrane signal transducer 
polypeptide. 


1369 


92 


593 
594 


Y53051 
Y27658 


Homo sapiens 
Homo sapiens 


Human secreted protein clone ddll9_4 protein 

sequence SEQ ID NO:108. 

Human secreted protein encoded by gene No. 92. 


1112 

763 


97
79


595 
596 


G03798 
AF151110 


Homo sapiens 
Mus 

musculus


Human secreted protein, SEQ ID NO: 7879. 
COP1 protein

Human secreted protein, SEQ ID NO: 7867.


156 
2215 

157 


58
95

65 


597 
598

599
600


G03786 
AF192499 

AF119855
G02872 


Homo sapiens 
Mus 

musculus 
Homo sapiens 
Homo sapiens 


putative secreted protein ZSIG37 
PRO1847

Human secreted protein, SEQ ID NO: 6953.
Human secreted protein encoded by gene 38. 


143 

236 
212 
567 


40 

76

73 

88 


601 

602

603 
604 

605 
606 


Y00295 
AF184971 
AF061936 
AL096828 

AB033106 
X75756 


Homo sapiens
Homo sapiens 
Homo sapiens 
Homo sapiens 

Homo sapiens 
Homo sapiens 


class II cytokine receptor ZCYTOR7 
diacylglycerol kinase iota
dJ963E22.1 (Novel protein similar to NY-REN-2
Antigen) 

KIAA1280 protein 
protein kinase C mu


2015
773 
1333 

3915 
3916 


74 
96 
93 

100
99


607

608 

609 


JJo070<> 

W69341 
W88627 


Homo sapiens

Homo sapiens 
Homo sapiens 


similar to D. melanogaster peroxidasin (U11052)
Secreted protein of clone CG279 1. 
Secreted protein encoded by gene 94 clone 
HPMBQ32. 


5758 
1377 
339 


99
99
82


610 
611 


Y27868 
AF202636 


Homo sapiens 
Homo sapiens 


Human secreted protein encoded by gene No. 
107. 

angiopoietin-like protein PP1158


116 
2164 


62 

100


612 
613 


AF090944 


Homo sapiens 
Homo sapiens 


PRO0663 

Human secreted protein encoded by gene 44 
clone HTDAD22. 


218 
195 


82
59 


614 

615 
616 


M87053 

" ACOOii32 
G01984 


Rattus 
norvegicus 
Homo sapiens 
Homo sapiens 


lens membrane protein 
FPM315

Human secreted protein, SEQ ID NO: 6065.


450 

163 
205 


84 

37 
79 





617 


Y91524 


Homo sapiens 


Human secreted protein sequence encoded by 
gene 74 SEQ ID NO:197.


821 


99 


618 


AJ24567.1 


Homo sapiens 


CTL2 protein 


2258 


99 


619 


Y76198 


Homo sapiens 


Human secreted protein encoded by gene 75. 


108 


64 


620 


AF067864


Homo sapiens 


transferrin receptor 2 alpha 


3922 


94 


621 


D90721 


Escherichia 
coli


Transmembrane protein dppC 


573 


90 


622 


W75858 


Homo sapiens 


Human secretory nrotein nf nlnnr. G s 752-'' 


730 


100


623 


Y94982 


Homo sapiens 


Human secreted protein vb!2 1,SEQ1DN0'4. 


733 


100 


624 


AF034745


Mus 

musculus 


LNXp80


637 


83 


625 


U42580 


Paramecium 
bursaria 
Chlorella 
virus 1 


Pro-rich, IP PPNMSLPLS (3x) 


"94 


46 












626 


U79260 


Homo sapiens 


unknown 


194 


70 


627 


R95913 


Homo sapiens 


Neural thread protein. 


99 


50 


628 


G03450 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7531.


427 


100 


629 


Y36281 


Homo sapiens 


Human secreted protein encoded by gene 58. 


590 


100


630 


Y02693 


Homo sapiens 


Human secreted protein encoded by gene 44 
clone HTDAD22.


165 


76 


631 


G02139 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6220. 


268 


96 


632 


U16996 


Homo sapiens 


protein tyrosine phosphatase


351




633 


AF121857 


Homo sapiens 


sorting nexin 7 


2019 


100


634 


AF283772 


Homo sapiens 


similar to Homo sapiens ribosomal protein L10 
encoded by GenBank Accession Number 
L25899 


340 


77 


635 


Y07090 


Homo sapiens 


Renal cancer associated antigen precursor 
sequence. 


777 

£.1 1 


o*f 


636 


AB013382 


Homo sapiens 


DUSP6 


414 


/O 


637 


G02872 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6953. 


315 


71 


638 


M95762 


Rattus 
norvegicus 


GABA transporter 


924 


89 


639 


G03789 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7870 


219 


\J\J 


640 


Y01400 


Homo sapiens 


Secreted protein encoded by gene 18 clone
HNHF029. 


137 


79 


641 


AC008075 


Arabidopsis 
thaliana 


F24J5.4 


121 


33 


642 


W74824 


Homo sapiens 


Human secreted protein encoded by gene 96 
clone HAQBK61. 


615 


62 


643 


AB015982 


Homo sapiens 


serine/threonine kinase 


485 


98 


644 


Y25806 


Homo sapiens 


Human secreted protein fragment encoded from 
gene 23. 


162 


46 


645 


AF122904 


Homo sapiens 


membrane protein DAP10


474 


100 


646 


AF233323 


Homo sapiens 


Fas-associated phosphatase-1


100


38 


647 


W48804 


Homo sapiens 


Homo sapiens clone BK1 58 I protein. 


1203 


99 


648 


AF257330 


Homo sapiens 


COBW-like protein 


1440 


98 


649 


Y36203 


Homo sapiens 


Human secreted protein #75. 


233 


73 


650 


G02872


Homo sapiens 


Human secreted protein, SEQ ID NO: 6953. 


173 


78 


651 


Y32199 


Homo sapiens 


Human receptor molecule (REC) encoded by 
Incyte clone 2022379. 


1032 


100 


652 


AB032909 


Hylobates
agilis 


dopamine receptor D4 


122 


32 


653 


AK021848 


Homo sapiens 


unnamed protein product 


186 


69


654 


W73411 


Homo sapiens 


Human secreted protein encoded by Gene No. 
15. 


57 


37 


655 


L22455


Rattus 
norvegicus 


mu opioid receptor 


116 


34 


656 


G03112 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7193.


110 


45 


657 


G02345 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6426. 


459 


97 


658 


W88627 


Homo sapiens 


Secreted protein encoded by gene 94 clone 
HPMBQ32. 


291 


75 


659 


G02832 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6913.


134 


65 


660 


Y91423 


Homo sapiens 


Human secreted protein sequence encoded by 
gene 11 SEQ ID NO:144.


333 


96 



Human secreted protein, SEQ ID NO: 7870.


168


58


661 

662


G03789
Y53886


Homo sapiens
Homo sapiens


A suppressor of cytokine signalling protein
designated HSCOP-6.


375 
629 


43 
100 


663 
664 

665 


W75771 

AT nQ£T7fl 

AB037734 


Homo sapiens
Homo sapiens 


Human GTP binding protein APD0B.
bA150A6.2 (novel 7 transmembrane receptor
(rhodopsin family) (olfactory receptor like)

protein (hs6M1-21))

KIAA1313 protein
Human cerebral protein-1.


480 

978 
192 


55 

96 
84 


666 
667 
668 


W82841 
AB030184 


Homo sapiens 

Homo sapiens

Mus 

musculus 


Human cerebral protein-1.
contains transmembrane (TM) region and ATP
binding region


182 
757 


87 
68 


669 


AB032919 


Hylobates 
muelleri 


dopamine receptor D4 


85 


37 


670

671 
672 
673 


Z33642 
W85608 
G03203 


Rattus
norvegicus
Homo sapiens 
Homo sapiens 
Homo sapiens 


outer membrane protein 

leukocyte surface protein 1 

Secreted protein clone du410 f _5. _ 1 

Human secreted protein, SEQ ID NO: 7284.


746 
394 

261
106 


81 

93 
91 
48 


674 
675 
676 
677 


AL035587 

G03797 
AF02^954 


Homo sapiens 

Homo sapiens
Bos taurus 


dJ475N16.4 (KIAA0240)

Secreted protein 108-005-5-o-U-FL.

Human secreted protein, SEQ ID NO: 7878.
pyruvate dehydrogenase phosphatase regulatory
subunit precursor; PDPr


2388 
1134 
174 
1013 


99 
53 
74 
95 


678 


L11625 


Mus 

musculus 


receptor protein-tyrosine kinase 


545
745


96

100


679 
680 

681 


AL031427 
AJ133430 

G02532 


Homo sapiens 
Mus 

musculus 
Homo sapiens 


dJ167A19J (novel protein) 

olfactory receptor 1 

Human secreted protein, SEQ ID NO: 6613. 


179 


77 
70 


682 
683 


G03789


Homo sapiens

Homo sapiens


Human secreted protein, SEQ ID NO: 7870.
Human secreted protein clone ytl4_l protein 
sequence SEQ ID NO:92. 


336 
118 


76 
100 


684 
685 


U43360 
G00885 


Peromyscus 
Homo sapiens 


reverse transcriptase 

Human secreted protein, SEQ ID NO: 4966.


too 

162 
590 


37 

60 
100 


686 
687 
688 


AK001518 

G01982 

Y92241 


Homo sapiens 
Homo sapiens 
Homo sapiens 


unnamed protein product 
Human secreted protein, SEQ ID NO: 6063. 
Human cancer associated antigen precursor 
(MO-REN-46).


718 
2405 


100 
99 


689 


AC024792 


Caenorhabditis
elegans


contains similarity to TRP78316 


423 


36 


690 


Y27868 


Homo sapiens 


Human secreted protein encoded by gene No. 
107. 


183 


81 


691 

692 
693 


Y56514 

Y27795 
Y36268 


Homo sapiens 

Homo sapiens 
Homo sapiens 


Human Jurkat cell clone P2-15 AIM10 longest 
ORF protein sequence. 

Human secreted protein encoded by gene No. 79. 
Human secreted protein encoded by gene 45. 


180 

1539 

428 

308 


88 

99 
98 
89 


695 
696 
697 


Y45272 

AF191838 

Y02693 


Homo sapiens
Homo sapiens 
Homo sapiens 
Homo sapiens 


ribosomal protein L35 

Human secreted protein encoded from gene 16. 
TANK binding kinase TBK1 
Human secreted protein encoded by gene 44 
clone HTDAD22.


1517 
TT242 
275 


" 99 
98 
75 


698 


Y87280 


Homo sapiens 


Human signal peptide containing protein HSPP- 
57 SEQ ID NO:57.


576 


90

99


699 
700 


Y97999 
AJ006701 


Homo sapiens 
Homo sapiens 


Human SCAD family molecule HSFM-1, SEQ
ID NO:1.

putative serine/threonine protein kinase 


729 
610 


79 


701 
702 


AF209198 
AJ298841 


Homo sapiens 
Mus 

musculus 


zinc finger protein 277 
torsinA protein 


2357
709

622


100
45

98


703 
704 

705 


AK021729 
Z46787 

G02882


Homo sapiens 
Caenorhabditis
elegans


unnamed protein product

similar to Glutaredoxin, Zinc finger, C3HC4 
type (RING finger) 
Human secreted protein, SEQ ID NO: 6963.


920 
"1 &9 


51 
9fe 



G02501
RS532T



707 



Homo sapiens 



Homo sapiens 



Human secreted protein, SEQ ID NO: 6582. 
Tumor necrosis factor receptor 1 death domain 
ligand (clone 2DD).



125 



121 



58 



95 



709 



Y962Q2 
M63577 



Homo sapiens 



Homo sapiens 



Human secreted protein, SEQ ID NO: 7083. 



IkappaB kinase 



(IKK) binding protein, Y2H56. 



125 



516 



39 



98 



ITT 



712 
713 



AB026291 



Saccharomyces
cerevisiae



131 



D212U 



Rattus 
norvegicus 



Homo sapiens 



acetoacetyl-CoA synthetase



4*7 



protein tyrosine phosphatase (PTP-BA&, type W 



Marmota
marmota



olfactory receptor 



615 



83 



714
715



716



717 
718 



AB033062



G00577 



Homo sapiens 
Homo sapiens 



Y96864 
AJ243396 



Homo sapiens 



Human secreted protein, SEQ ID NO: 7642.
KIAA1236 protein



Homo sapiens 



Human secreted protein, SEQ ID NO: 4658. 



SEQ. ID. 37 from WO0034474. 



251 



1380 
80 



835 



100 



100 



73 



Homo sapiens 
Homo sapiens 



voltage-gated sodium channel beta-3 subunit



234 



100 



720 



721 



722 
723 



AB020598 



Y53886 



Homo sapiens 
Homo sapiens 



similar to chicken gamma aminobutyric acid 
receptor beta4 subunit 



578 



peptide transporter 3 



J05046 
AF001958 



Homo sapiens 
Ambystoma 



A suppressor of cytokine signalling protein 

designated HSCOP-6. 

insulin receptor-related receptor 



1096 



570 



678T 



99 



100 



74 



100 



tigrinum



electrogenic Na+ bicarbonate cotransporter, 
NBC 



111 



41 



Mus 
musculus 
Homo sapiens 



semaphorin cytoplasmic domain-associated 
protein 3A 



5253 



w 



726 
ITT 



728 



729 
730



731 



AF016191 



AB029559 



Rattus 
norvegicus 



GABA transporter 



potassium channel



3114 



Y28503 
AJ011415 



Rattus 
norvegicus 



BATI 



Z93096 



Homo sapiens 
Homo sapiens 



HGFH3 Human Growth Factor Homologue 3, 



Homo sapiens 



plexin-Bl/SEP receptor 



210062 



Homo sapiens 



DK390B3.1 (manic fringe (Drosophila) 
homolog) 



cDNA encoding a human vanilloid receptor 
homologue Vanilrepl. 



370 



139 



2186 



729 



142 



675 



99 



100 



35 



97 



56 



68 



732 
733 



734 



AF161382

Abowoft 



AE000493 



Homo sapiens 
Homo sapiens 



Escherichia 

coli 

Homo sapiens 



HSPC264 
KIAA1110 protein



492 



putative transport protein 



3826 



592 



99 



97 



736 



737 



738 
739 



AF132599 



X55019 



X91906 
AB026116 



Homo sapiens 



dJ41 7022,2 (novel 7 transmembrane receptor 
(rhodopsin family) protein similar to high- 

acid receptor homolog) 



2173 



RANTES factor of late activated T lymphocytes- 



245 



Homo sapiens
Homo sapiens 



acetylcholine receptor delta subunit 



voltage-gated chloride ion channel 



883 



1978 



99 



56 



99



740 



741 



742 
743



D00570 



Homo sapiens 



V703626 



U66059 
AF119815



Mus 
musculus 
Homo sapiens 



organic anion transporter 4 



open reading frame (196 AA) 



Homo sapiens 
Homo sapiens 



Human thyrotropin QPRN-terminal sequence. 



V segment translation product



1444 



83 



118 

614 



98 



24 



40 



744 
745



746 



747 



748 



XI666T 
W67838 



Homo sapiens 



Homo sapiens 



W57260 
W2T57T 



Homo sapiens 



Homo sapiens 



G-protein-coupled receptor



haematopoietic lineage cell protein (AA 1-486) 



Human secreted protein encoded by gene 32 
clone HLTCJ63. 



Human semaphorin Y. 



Alzheimer's disease protein encoded by DNA 
from plasmid pGCS2232. 



2751 



148 



448 



968 



99 



93 



95 



65 



750 



Homo sapiens 



AL022238 
G03889 



Homo sapiens



Human secreted protein clone yd218J protein
sequence SEQ ID NO:76. 



622 



Homo sapiens 



d)l(M2K\0.5 (novel protein) 



Human secreted protein, SEQ ID NO: 7970. 



314 



391 



100 



85 



87 



773



751 



752 
753 



AB025258 



Mus 

musculus 



granuphilin-a



Homo sapiens 



Human transmembrane protein HP02000. 



900 
2527 



99 
99 



754 



755



Y48586 



Homo sapiens 



Human breast tumour-associated protein 47. 



AJ272207 



M85183 



Homo sapiens 



Rattus 
norvegicus 



putative G protein-coupled receptor 92



694 



vasopressin receptor 



979 



100 



68 



756



757 



758 
759 



AF190501 



Homo sapiens 



Y02692 



Homo sapiens 



leucine-rich repeat-containing G protein-coupled 
receptor 6 



Human secreted protein encoded by gene 4i 



clone HTADX17. 



Homo sapiens ALK-3 



461 



439 
564 



87 



2L 
97 



760 



761 



762 



R04932 



W74902 



G03706



Homo sapiens 



Interferon-gamma receptor segment from clone 
39 responsible for binding the target



Homo sapiens 



AB020676 



Homo sapiens 



Homo sapiens 



Homo sapiens



Human secreted protein encoded by gene 175 

clone HE8BI92, 

Human secreted protein, SEQ ID NO: 7787.



1217 



KIAA0869 protein



223 
443T 



unnamed protein product 



2285 
573 



99 



99 
99 



764 
765 



AF173358 



Homo sapiens 



glucocorticoid receptor AF-1 coactivator-1



AF268066 



Mus 
musculus 



netrin 4 



89 
89



766



767 



768 



Y48585 



AF230378 



Homo sapiens 



Mus 
musculus 



Human breast tumour-associated protein 46. 
interleukin-1 delta



309 



AF121975 



Mus 

musculus 



odorant receptor S18 



268 



45 
6T 



769 



770 



771 



772 



773 



774 



AB008515 



Homo sapiens 



RanBPM 



Y09945 



Rattus 
norvegicus 



putative integral membrane transport protein 



AF226731 



Y27132 



Homo sapiens 
Homo sapiens 



AD026 

Human glioblastoma-derived polypeptide (clone 
QA004FG). 



X87832 



Homo sapiens 



NOV/plexin-Al protein 



458 



T38T 



AB025258 



Mus 

musculus 



granuphilin-a 



1821 
500 



50 



99

w 



98 



41 



775 



AF125101 



Homo sapiens 



HSPC040 protein



Homo sapiens 



Human secreted protein, SEQ ID NO: 6896.



232 
314 



93 
95 



G02493 



Homo sapiens 



Human secreted protein, SEQ ID NO: 6574.



191 
2U~ 



68 
4T 



778 



779 
780 



R03301 



Homo sapiens 



AL357374 



Homo sapiens 



Sequence of pre-human atrial natriuretic peptide. 
bA353C18.2 (novel protein) 



232 
1434



100 
89 



781 



AF100346 



Homo sapiens 



Y19566 



Homo sapiens 



neuronal voltage gated calcium channel gamma- 

3 subunit

Amino acid sequence of a human secreted 



103 



protein. 



Y36233 



Homo sapiens 



Human secreted protein encoded by gene 1U. 



1098 
141 



52 



93 
30 



783 



784 



785 



786 



787 
788 



AF084464 



Rattus 
norvegicus 



GTP-binding protein REM2 



W49042 



Homo sapiens 



Human low density lipoprotein binding protein 
LBP-3. 



2693 



AF238381 



Homo sapiens 



PTOV1 



Y91870 



Homo sapiens 
Homo sapiens 



Human apoptosis related protein. 



Human membrane transport protein, MTRF-7. 



1904 



547 



1062 
8684 



99 



91 



100 



94 
98 



AF117754



Homo sapiens 



thyroid hormone receptor-associated protein 
complex component TRAP240 






797 


AF258340 


Homo sapiens 


hepatocellular carcinoma-associated antigen 112


1151 


99 


798 


AF159615 


Homo sapiens 


FGF receptor activating protein 1 


461 


98 


799 


Y59863 


Homo sapiens 


Human normal uterus tissue derived protein 26. 


797 


99 


800 


W70459 


Homo sapiens 


Human Tl -receptor ligand m splice variant 2. 


572 


92 


801 


L00073 


Homo sapiens 


renin 


1913 


93 


802 


P92219 


Homo sapiens 
(human) 


CR1 protein. 


11963


97 


803


X15357 


Homo sapiens 


ANP-A receptor preprotein (AA -32 to 1029)


5109 


AO 


804 


W64473 


Homo sapiens 


Human secreted protein from clone EC172 1. 


4018 


95 


805 


AJ243874 


Homo sapiens 


oligoptrenin-4 


2067 


100 


806 


G01731


Homo sapiens 


Human secreted protein, SEQ ID NO: 5812. 


284 


100 


807 


Z24680 


Homo sapiens 


garp 


1562 


83 


808 


AF171669 


Homo sapiens 


glycoprotein-associated amino acid transporter 
LAT2 


1364 


90 


809 


W70321 


Homo sapiens 


Secreted protein CC198J. 


1154 


96 


810 


W74843 


Homo sapiens 


Human secreted protein encoded by gene 115
clone HOVBA03. 


855 


99 


811 


AF108831 


Homo sapiens 


K-Cl cotransporter 3


4561 


100 


812 


AF092135 


Homo sapiens 


PTD014 


862 


100 


813 


AF283772 


Homo sapiens 


similar to Homo sapiens ribosomal protein L10
encoded by GenBank Accession Number 
L25899 


784 


100 


814 


G01563 


Homo sapiens 


Human secreted protein, SEQ ID NO: 5644. 


330 


100 


815 


AF051151 


Homo sapiens 


Toll/interleukin-1 receptor-like protein 3 


3850 


99 


816 


W95630 


Homo sapiens 


Homo sapiens secreted protein gene clone 
galMJ. 


358 


100 


817 


G01082


Homo sapiens 


Human secreted protein, SEQ ID NO: 5163. 


549 


100 


818 


AF151800 


Homo sapiens 


CGI-41 protein 


1106 


95 


819 


L00352 


Homo sapiens 


low density lipoprotein receptor 


3980 


100


820 


X04434 


Homo sapiens 


IGF-I receptor


5832 


9ir 


821 


G03844 


Homo sapiens


Human secreted protein, SEQ ID NO: 7925. 


572 




822 


AF212220 


Homo sapiens 


TERA 


396 


48 


823 


Y50125 


Homo sapiens 


Human glycophosphatidylinositol-anchored
protein GPI-122.


4897 


00 


824 


AF156778


Homo sapiens 


ASB-3 protein 


2675 




825 


AF096322 


Homo sapiens 


neuronal voltage-gated calcium channel gamma- 
2 subunit


1105 


100 


826 


Y07972 


Homo sapiens 


Human secreted protein fragment #2 encoded 
from gene 28. 


1540 


100 


827 


AB032013 


Homo sapiens 


potassium channel Kv8.1


2435 


95


828 


Y13620 


Homo sapiens 


BCL9 


5284 


96 


829 


Y91474 


Homo sapiens 


Human secreted protein sequence encoded by 
gene 24 SEQ ID NO:147.


541 


98 


830 


X54232 


Homo sapiens 


glypican 


1625 


87 


831 


X14830 


Homo sapiens 


acetylcholine receptor beta-subunit preprotein 


2540 


100 


832 


Y71262


Homo sapiens 


Human chondromodulin-like protein, Zchml. 


1002 


100 


833 


G03873 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7954. 


638 


96 


834 


AC003030


Homo sapiens 


R29828 1 


1389 


93 


835 


Y38422 


Homo sapiens 


Human secreted protein. 


964 


87 


836 


U41557 


Caenorhabditis
elegans


glycine-rich 


85 


36 


837 


AL121889 


Homo sapiens 


dJ1076E17.1 (KIAA0823 protein (continues in 
AL023803)) 


998 


75 


838 


AJ011415 


Homo sapiens 


plexin-Bl/SEP receptor 


1580 


60 


839 


W80398 


Homo sapiens 


A secreted protein encoded by clone cwl 543 3. 


1105 


67 


840 


G00862 


Homo sapiens 


Human secreted protein, SEQ ID NO: 4943. 


255 


92 


841 


G02650 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6731.


644 


97 


842 


AF036717 


Homo sapiens 


FGFR signalling adaptor SNT-1


2629 


99 


843 


Y73446 


Homo sapiens 


Human secreted protein clone yc27 1 protein 
sequence SEQ ID NO:114.


1089 


100 


844 


G02872 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6953. 


357 


69 


845 


AF151810 


Homo sapiens 


CGI-52 protein


3443 


88 


846 


X83378 


Homo sapiens 


putative chloride channel 


1620 


99 


847 


AC004883 


Homo sapiens 


similar to general transcription factor 21; similar
to AF038969 (PID:g2827207)


655


96


848 
849 


X99886 

a r , An^<is*7 
AU)Ujjo / 


Homo sapiens 

Homo sapiens


monocyte chemotactic protein-2 

similar to mouse olfactory receptor 13; similar to 

P34984 (PID:g464305) 


160 
963 


76 
98 


850




Homo sapiens


G protein-coupled receptor C5L2


1767 


100 


851




Homo sapiens


ARF GTPase-activating protein GIT1 


3415 


98 


852


VRfi017 
X 0O«t / 


Homo sapiens


Human secreted protein HWHGU54, SEQ ID 
NO: 132. 


1189 


99 


853




Homo sapiens 


chloride channel protein 7


3748 


99 


854






furin (AA 1-794) 


3550 


99 


855 


W78245 


Homo sapiens 


Fragment of human secreted protein encoded by 
gene 19. 


1245 


99 


856






Interleukin-2 receptor associated protein p43. 


1926 


100 


857 


VA 1 
1*11 /OD 


Homo sapiens


Human PRO1083 protein sequence. 


3211 


99 


858 


AT?A«T5A/i 

AtUj /jUo 


Homo sapiens


transmembrane proteolipid 


481 


84 


859 




Homo sapiens


unnamed protein product


374 


69 


860 


Y41312 


Homo sapiens 


Human secreted protein encoded by gene 5 clone 
HI DRM43. 


824 


100 


862 


Y25770


Homo sapiens


Human secreted protein encoded from gene 66. 


895


99


863 


Y74188 


Homo sapiens 


Human prostate tumor EST fragment derived
protein #375.




30 


864 


At Io/4/j 


Homo sapiens


heme-binding protein


870


99


865




Homo sapiens


Human secreted protein, SEQ ID NO: 6613. 


211 


67 


866


A34o7U 


Homo sapiens


Type II integral membrane protein 


1201 


100 


867 


G00700


Homo sapiens


Human secreted protein, SEQ ID NO: 4781.


640 


99 


868 


YU7©>4 


Homo sapiens


Human secreted protein fragment encoded from 
gene 43. 


388


88 


869




Homo sapiens


preproenkephalin


1349 


95 


870 


Y91632 


Homo sapiens 


Human secreted protein sequence encoded by 
gene 25 SEQ ID NO:305. 


1048 


98 


871 


L04311 


Homo sapiens 


GABA-alpha receptor beta-3 subunit 


237 


93 


872 




Homo sapiens


Human cytokine family member EF-7 protein. 


960 


94 




Api/CiaRO 


Homo sapiens


HSPC264 


1124 


99 


874 


G03412


Homo sapiens


Human secreted protein, SEQ ID NO: 7493. 


464 


100 


875 




Homo sapiens


Human secreted protein encoded by gene No. 6.


573 


96 


876


ivli jOJU 


Homo sapiens


B-cell growth factor


171 


56 


877 


W63681 


Homo sapiens 


Human secreted protein 1. 


1652 


99 


878 


Ju£7do7 


Rattus
norvegicus


neurexophilin


1448 


98 




I lUOJ J 




Amino acid sequence of a human secreted 
protein. 


321 


100 


880 


W88991 


Homo sapiens 


Polypeptide fragment encoded by gene 144. 


936 


100 


881


AtillfilftO 




orphan G protein-coupled receptor 


1971 


100 


882


ARWIflRfft 


Homo sapiens


EDRF 


528 


100 


883




Homo sapiens


cathepsin L 


209 


72 


884




Homo sapiens


Human secreted protein clone dhl073 J2 protein 
sequence SEQ ID NO:106. 


348 


100 




AF070661 


Homo sapiens 


HSPC005 


404 


100 


886 


Y04315 


Homo sapiens 


Human secreted protein encoded by gene 23. 


385 


100 




X92744 


Homo sapiens 


hBD-1 


375 


100 


888


Y224Qf> 


Homo sapiens 


Human secreted protein sequence clone 
cn621 8. 


994 


94 


889




Homo sapiens


Human soluble protein ZTMPO-1. 


4595


99 


890


G03714 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7795.


147 


63 


891 


AF208856


Homo sapiens 


BM-014 


1012 


99 


892 


U29195 


Homo sapiens 


neuronal pentraxin II 


2002 


98 


893 


X68149 


Homo sapiens 


Burkitt lymphoma receptor 1


1953 


100 


894 


Y94914 


Homo sapiens 


Human secreted protein clone pw337_6 protein 
sequence SEQ ID NO 34. 


537 


100 


895 


W61630 


Homo sapiens 


Clone HNFGW06 of EGFR receptor family. 


326 


63 


896 


M24110 


Homo sapiens 


GOS19-2 peptide precursor 


481 


100 


897 


Z68747 


Homo sapiens 


imogen 38


2018


55


898 


AF186112


Homo sapiens


neurokinin B-like protein ZNEUROK1


619 


100 


899 


AF225420 


Homo sapiens 


AD025 


734 


100





900 


P60657 


Homo sapiens 


Sequence of human lipocortin


1835 


100 


901 


M27288 




oncostatin M


1297 


99 


902 


W85737 




_ ■* wim imnamemprane QOIDttlll. 


749 


100 


903 


G01349


Homo sapiens


Human secreted protein, SEQ ID NO: 5430.


650


99


904 


Y00261 


Homo sapiens




1133 


99 


905


AF039688


Homo sapiens 


antigen NY-CO-3 


771 


99 


906 


AB00783* " 


Homo sapiens


niC-j 


2544 


100 


907 


AB017507 


Homo sapiens




224 


100


908


AK000056


Homo sapiens


unnamed protein product 


1537 


98 


909 


Y86299 


Homo sapiens


Human secreted protein HFOXB55, SEQ ID 


427 


100 


910 


AF231023 


Homo sapiens


protocadherin flamingo 1


7393 


99 


911 


Y14134 


Homo sapiens


Vascular endothelial cell growth inhibitor beta 
protein sequence. 


1319 


100 


912 


290420 


Homo sapiens


Human uur-j (nGDr-3) polypeptide encoding 


1950 


100 


913 


Y19757 




ocy ijj nu tf/j trom W09922243. 


1361 


100 


914


G03172




Human secreted protein, SEQ ID NO: 7253.


112 


48 


915 


U14971 


Homo sapiens


ribosomal protein S9


886 


90 


916 


AF172854


Homo sapiens


cardiotrophin-like cytokine CLC


1204 


99 


917 


AC005525 


Homo sapiens


rZzlo<£ 1 


1963 


100 


918 


AF166350 


Homo sapiens


M/ protein 


4711 


99 


919 


Y87285 


Homo sapiens


Human signal peptide containing protein HSPP- 
ox oty il» iNu.o*. 


430 


100 


920 


Y36131 


Homo sapiens


Human secreted protein #3. 


465 


88 


921 


AF193766 


Homo sapiens 


cytokine-like protein C17 


724 


100 


922 


Y95013 


Homo sapiens 


Human secreted protein vc48_1, SEQ ID NO:66.


357 


100 


923 


X75208 


Homo sapiens


protein tyrosine kinase-receptor 


5256 


100 


924 


Y96202 


Homo sapiens 


IkappaB kinase (IKK) binding protein, Y2H56. 


813 


98 


925


AB039886 


Homo sapiens


down-regulated in gastric cancer 


785 


78 


926 


G03368 


Homo sapiens


Human secreted protein, SEQ ID NO: 7449 


55 


50 


927 


Y48606 


Homo sapiens


Human breast tumour-associated protein 67. 


539 


100 


928 


Y36151 


Homo sapiens 


Human secreted protein #23. 


668 


100 


929 


AF110399


Homo sapiens 


elongation factor Ts 


1666 


100 


930 


AF210317 


Homo sapiens 


facilitate glucose transporter family member 
rii i rro 


2763 


99 


931 


Y73328 


Homo sapiens


h J km clone G82843 protein sequence. 


931 


100 


932 


G01959 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6040. 


274 


100 


933 


U47924 


Homo sapiens 


B-cell receptor associated protein 1 


1469 


100 


934 


G03827 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7908. 


529 


93 


935 




Homo sapiens 


mitochondrial ABC transporter 3 


196


63


936 


X5&85 


Canis
familiaris


rab8 


1064 


100 


937 


B08906


Homo sapiens


Human secreted protein sequence encoded by
gene 16 SEQ ID NO:63.


117 


44


938 


M13692 


Homo sapiens 


alpha-1 acid glycoprotein precursor


1064 


99 


939 


Y53886 


Homo sapiens


A suppressor of cytokine signalling protein 


515 


42


940


Y16630 


Homo sapiens 


Human Putative Adrenomedullin Receptor 
(PAR). 


1904 


99 


941 


AC005102 


Homo sapiens 


small inducible cytokine subfamily A member 


627 


99 


942 


M12886 


Homo sapiens


T-cell receptor beta chain


1289 


81 


943 


AF226046 


Homo sapiens 




1049 


98


944


Y36078 


Homo sapiens 


Extended human secreted protein sequence, SEQ
ID NO. 463.


667


100


945 


M22877 


Homo sapiens 


cytochrome c 


565 


100 


946 


W67869 


Homo sapiens 


Human secreted protein encoded by gene 63 
clone HHGDB72. 


551 


93 


947 


W67859 


Homo sapiens 


Human secreted protein encoded by gene 53 
clone HBMCL41. 


283 


100


948 


W85726 


Homo sapiens 


Novel protein (Clone BG33_7).


789 


100 


949 


AJ242015 


Homo sapiens 


eMDCII protein 


4236 


100


950


G04075 


Homo sapiens 


Human secreted protein, SEQ ID NO: 8156.


567 


99 



candidate tumor suppressor p33 ING1 homolog


1314


100


951 
952 

953 
954 
955 


AFU0645 J 
Y36111 

AB012109 
AF246221
AF054986 


Homo sapiens 
Homo sapiens 

Homo sapiens 
Homo sapiens
Homo sapiens


Extended human secreted protein sequence, SEQ 
ID NO. 496. 

APC10
transmembrane protein BRI
putative transmembrane GTPase


402 
990 

L*r\JJ 

1883 


70 

100 
100 
100 


956 
957 
958 
959 

960 
961 


W74726 
Y27096 
AJ222967
Y53052 

G02694 
AF151855


Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens

Homo sapiens 
Homo sapiens 


Human secreted protein fg949_3.
Human viral receptor protein (ACVRP).
cystinosin
Human secreted protein clone df202_3 protein
sequence SEQ ID NO: 110.
Human secreted protein, SEQ ID NO: 6775.
CGI-97 protein
diabetes mellitus type 1 autoantigen


1879 
1581 
1920 
587 

283 

S56 


100 
100 
100 
100 

100

96


962 
963 
964 
965 

966 
967 
968 


02^594 
AL050306 
AF078859 
AB020315 

X04571 

Ax 14601 y j 

AF071002 


Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 

Homo sapiens 

Homo sapiens

Homo sapiens 


dJ475B7.2 (novel protein)
PTD004
homologue of mouse dkk-1 gene:Acc#
AF030433

precursor polypeptide (AA -22 to 1185)
hepatocellular carcinoma antigen gene 520
minK-related peptide 1; MiRP1
membrane-type-5 matrix metalloproteinase


3796 
2089 
1466 

6580 
993 
632 
3545 


100 
100 
100 

99 
99 
100 
100 


969 
970 
971 
972 
973 

974 
975 


AB021227
AF180920
AF105365
AF083248
AJ132429

W61619
AF155100


Homo sapiens 
Homo sapiens 
Homo sapiens
Homo sapiens 
Homo sapiens 

Homo sapiens
Homo sapiens 


cyclin L ania-6a

K-Cl cotransporter KCC4
ribosomal protein L26 homolog
hyperpolarization-activated cyclic nucleotide
gated cation channel hHCN4

Clone HTPEF86 of TM4SF superfamily.
zinc finger protein NY-REN-21 antigen


1579 
5621 
739 
6295 

454 
2261 


100 
99 
100 
100 

165 

100 


976 
977 
978 


AF275948 
AB026891 
AF117657


Homo sapiens 
Homo sapiens 
Homo sapiens 


ABCA1 

i-.wt?f5 TM»/trli it*jTTK*t*» trun snorter 

thyroid hormone receptor-associated protein 
complex component TRAP80 


11763 

2552 

3348 


99 

100 

99 


979 
980 


AF044201 
AF119297


Rattus
norvegicus
Homo sapiens


neural membrane protein 35; NMP35
neuroendocrine-specific protein-like protein 1


1570 
1170 


92 
99 


981 
982 

983 


AF155652 
W88499 

Z56281 


Homo sapiens 
Homo sapiens 

Homo sapiens 


potassium channel modulatory factor

Human stomach carcinoma clone HP10412-encoded protein.
interferon regulatory factor 3


1983 
1553

2012 


99 
99 

98 


984 
985 

986 


AB026125 
Y14482

AB023888


Homo sapiens
Homo sapiens

Homo sapiens 


ART-4 

Fragment of human secreted protein encoded by
gene 17. 

b-chemokine receptor CCR4 


F2I60 
1 172 

1895 


100 
/u 

100 


987 
988 


W27291 
AF153450 


Homo sapiens 
Manduca
sexta


Human H1075-1 secreted protein 5' end.
juvenile hormone esterase binding protein

Human secreted protein, SEQ ID NO: 7778. 


712 

1 ZJ.O 

J___ 


100 


88 


989 
990 


G03697 
AF204159 


Homo sapiens
Homo sapiens


potassium large conductance calcium-activated
channel beta 3a subunit
Human secreted protein, SEQ ID NO: 6142. 


1486 
558 


100 
99 


991 
992 

993 


G02061 
AL031266 

Y66749 


Homo sapiens 
Caenorhabditis
elegans
Homo sapiens


VM106R.1

Membrane-bound protein PRO1124.
Human secreted protein, SEQ ID NO: 5327. 


327 

4730 
141 


40 

99 
77 


994 
995 
996 


G01246 
AF133845 
AF117756


Homo sapiens
Homo sapiens
Homo sapiens 


corin 

thyroid hormone receptor-associated protein 
complex component TRAP150
Human stem cell antigen 2.


5811
4999 

284 


100 
93 


997 
998 

999 

1000 

1001 


W62066 
Y87173 

Y13379 
Y95008 
AF190167 


Homo sapiens 
Homo sapiens 

Homo sapiens 
Homo sapiens 
Homo sapiens 


Human secreted protein sequence SEQ ID 
NO:212

Amino acid sequence of protein PRO263.

Human secreted protein vS 1, SEQ ID NO:56.
membrane associated protein SLP-2


725 

1654 
676 
1747


100 

99 
47 
100 





1002 


G01234 


Homo sapiens 


Human secreted protein, SEQ ID NO: 5315.


398 


96 


1003 


#73426 


Homo sapiens 


Human secreted protein encoded by Gene No. 
24. 


2150 


100 


1004 


X12791 


Homo sapiens 


19 kD SRP-protein (AA 1-144)


742 


100 


1005 


M23323 


Homo sapiens 


membrane protein 


642 


100 


1006 


X63745 


Homo sapiens 


KDEL receptor 


326 


98 


1007


Y35997 


Homo sapiens 


Extended human secreted protein sequence, SEQ 
ID NO. 382. 


824 


99 


1008


AB032yi« 


Hylobates 
moloch 


dopamine receptor D4 


92 


35 


1009 


Y91680 


Homo sapiens 


Human secreted protein sequence encoded by 
gene 81 SEQ ID NO:353.


1372 


99 


1010 


AL136125 


Homo sapiens 


dJ304B14.1 (novel protein) 


825 


98 


1011


G03733 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7814. 


379 


98 


1012 


Y17531


Homo sapiens


Human secreted protein clone BL205_14 protein.


818 


97 


1013 


G00724 


Homo sapiens 


Human secreted protein, SEQ ID NO: 4805. 


462 


100 


1014 


AF288092 


Naegleria
gruberi


haem lyase 


114 


37 


1015 


AB045292 


Homo sapiens 


M83 protein 


3867 


99 


1016 


X15940 


Homo sapiens 


ribosomal protein L31 (AA 1-125) 


644 


100 


1017 


Y94873 


Homo sapiens 


Human protein clone HP02632. 


1876 


100 


1018 


AL024498 


Homo sapiens 


dJ417M14.1 (novel protein) 


589 


100 


1019 


X83425 


Homo sapiens 


Lutheran blood group glycoprotein 


3054 


99 


1020 


W03516 


Homo sapiens 


Prostaglandin DP receptor. 


1864 


100 


1021 


G03960 


Homo sapiens 


Human secreted protein, SEQ ID NO: 8041. 


398 


100 


1022 


Y91689 


Homo sapiens 


Human secreted protein sequence encoded by 
gene 93 SEQ ID NO:362.


768 


100 


1023 


AE000660 


Homo sapiens 


hADV36S1


573 


100 


1024 


AF132965


Homo sapiens 


CGI-31 protein 


1550 


100 


1025 




Homo sapiens 


Human TR-interacting protein SI 03 a. 


1466 


$7 


1026 


R66278 


Homo sapiens 


Therapeutic polypeptide from glioblastoma cell 
line. 


830 


100 


1027 


X65614 


Homo sapiens 


S100P calcium-binding protein


476 


100 


1028 


Y41741 


Homo sapiens 


Human PRO704 protein sequence. 


1323 


100 


1029 


AJ001014


Homo sapiens 


RAMP1 


806 


100


1030 


W63682 


Homo sapiens 


Human secreted protein 2. 


1354 


99


1031 


AK023007 


Homo sapiens 


unnamed protein product 


766 


100 


1032 


W97900 


Homo sapiens 


Human SR-BI class B scavenger. 


2672 


99 


1033 


Y82453 


Homo sapiens 


Human TGC-440 secretory protein SEQ ID 
NO:1.


639 


99 


1034 


Y73473 


Homo sapiens 


Human secreted protein clone ydl78J protein 
sequence SEQ ID NO: 168. 


752 


93 


1035 


Y86468


Homo sapiens


Human gene 48-encoded protein fragment, SEQ
ID NO:383.


96 


90 


1036 


U09813 


Homo sapiens 


mitochondrial ATP synthase subunit 9 precursor


698 


100 


1037 


AJ242832 


Homo sapiens 


calpain


3699 


99 


1038 


X66403 


Homo sapiens 


acetylcholine receptor epsilon subunit CHRNE 


2574 


100 


1039 


AJ242730 


Homo sapiens 


polyhomeotic 2 


1310 


100 


1040


AF169968 


Mus 

musculus 


DNA binding protein DESRT 


1453 


80 


1041 


X52563 


Bos taurus


permeability increasing protein


383 


29 


1042 


G00368 


Homo sapiens 


Human secreted protein, SEQ ID NO: 4449. 


75 


50 


1043 


G02532 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6613. 


60 


53 


1044 


M94582 


Homo sapiens 


interleukin 8 receptor B 


1850 


100 


1045 


AL080239 


Homo sapiens 


bG256022.1 (similar to IGF ALS (insulin-like 
growth factor binding protein, acid labile 
subunit)) 


1704 


50 


1046 


AF125101 


Homo sapiens 


HSPC040 protein 


580 


100 


1047 


W74809 


Homo sapiens 


Human secreted protein encoded by gene 81
clone HMWDN32.


176 


100 


1048 


AL022238


Homo sapiens


dJ1042K10.4 (novel protein)


2201 


100 


1049 


W88667 


Homo sapiens 


Secreted protein encoded by gene 134 clone
HAIBP89. 


1559 


99 


1050 


AF097518 


Homo sapiens 


liver-specific transporter


2820 


100 





1051 


W78324 


Homo sapiens 


Fragment of human secreted protein encoded by 
gene 81. 


1318 


98


1052 


Y21851 


Homo sapiens 


Human signal peptide-containing protein (SIGP)
(clone ID 2328134).


1643 


95 


1053 


AL163815 


Arabidopsis 
thaliana 


putative protein 
Human secreted protein encoded by gene 77.


661 
162 


62 
100 


1054 
1055 


Y76200 
AJ276567 


Homo sapiens 
Homo sapiens 


TC10-like Rho GTPase


1160 
154 


100 

yo 


1056 
1057 


Y27620 
D14530 


Homo sapiens 
Homo sapiens 


] fuman secreted protein encoded by gene no*. 
ribosomal protein 


745 
1132 


100 
100 


1058
1059


AT 1 JzUUU 
ALUJ1 / lo 


Homo sapiens
Homo sapiens


TADA1 protein 

dJ34B2M (novel BZRP (benzodiazapine 
receptor (peripheral) (MBR, PBR, PBKS, IBP, 
Isoquinoline-binding protein)) LIKE protein)


920 
134 


100 

33 


1060 
1061
1062 


AF227135 

Y27575 

Z11697 


Homo sapiens 
Homo sapiens
Homo sapiens


candidate taste receptor T2R9

Human secreted protein encoded by gene No. 9.

HB15 


1392 
1088 
819 


100 
100 
100 


1063 
1064 
1065 


AF123757
AF155135 
Y41674 


Homo sapiens
Homo sapiens 
Homo sapiens 


HOYvl jClinai pigniGiil v^iuiviioi vw piuiviu 
Unman r»Vi«tini»1».rf»TatRfl molecule HCRM.-2. 


2932 
936 


99 
99 


1066 


AJ250042 


Homo sapiens


Iv3x)^ \jxJrt\J lit wU/iuuigc iom*ji 


2575 


100 


1067 


Y36087 


Homo sapiens


Extended human secreted protein sequence, SEQ
ID NO. 472.


770 


85 


1068 


Y94959 


Homo sapiens 


Human secreted protein clone mc300_1 protein
sequence SEQ ID NO: 124.


301 


100 


1069 


Y94959 


Homo sapiens 


Human secreted protein clone mc300_1 protein
sequence SEQ ID NO: 124.


301 


100 


1070 


W64535 




Human leukocyte cell clone HP00804 protein. 


2014 


99 


1071 


X03145 


Homo sapiens


nnt ORFIII 


148 


50 


1072 


AL031177 


Homo sapiens 


HTJtROMl S ~K ( novel ttroteinl 


821 


91 


1073 


X82200


Homo sapiens 




249 


62 


1074 


G03213 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7294.


99 


47 


1075 
1076 


Y36233 
G03187 


Homo sapiens 
Homo sapiens 


Human secreted protein encoded by gene 10. 
Human secreted protein, SEQ ID NO: 7268.


506 
424 


55 
98 


1077 
1078 


L25899 

Y91447 


Homo sapiens 
Homo sapiens 


ribosomal protein L10

Human secreted protein sequence encoded by
gene 48 SEQ ID NO: 168.


332 
898 


76 
97 


1079 


G01862




Human secreted protein, SEQ ID NO: 5943. 


290 


89 


1080 




Homo sapiens


WNT receptor frizzled-3


1376 


92 


1081




Homo sapiens


Na/PO4 cotransporter homolog


269 


100 


1082




Homo sapiens


ribosomal protein small subunit


499 


80 


1083




Homo sapiens


Human secreted protein encoded by gene 42 
clone HSXBI25. 


143 


81 






Homo sapiens


Human secreted protein, SEQ ID NO: 7645. 


83 


51 








Human secreted protein, SEQ ID NO: 8144. 


88 


43 


1086 


AF090942 


Homo sapiens 


PRO0657 


124 


64 


1087 


G00517


Homo sapiens


Human secreted protein, SEQ ID NO: 4598. 


129 


41 


1088 


G04091 


Homo sapiens 


Human secreted protein, SEQ ID NO: 8172. 


126 


36 


1089 


Arl4Uo3l 




G-protein coupled receptor 14 


364 


82 


1090 


G04063




Human secreted protein, SEQ ID NO: 8144.


Hi 


32 


1091 


S72304 


Mus sp.


LMW G-protein


146


83 


1092 


W88708 


Homo sapiens 


Secreted protein encoded by gene 175 clone 
HEMAM41.


405 


100 


1093 


W85612 


Homo sapiens 


Secreted protein clone fh123_5.


4358 


97 


1094 


Y53012 


Homo sapiens 


Human secreted protein clone pm514_4 protein
sequence SEQ ID NO:30.


1013 


99 


1095 


Y92345 


Homo sapiens 


Human cancer associated antigen precursor from 
clone NY-REN-62. 


409 


100 


1096 


AF090942 


Homo sapiens 


PRO0657 


147 


60 


1097 


L24521 


Homo sapiens 


transformation-related protein 


166 


58 


1098 


X56932 


Homo sapiens 


23 kD highly basic protein 

Human secreted protein, SEQ ID NO: 8144.


490
83


70

35


1099 
1100 


G04063 
Y02693 


Homo sapiens 
Homo sapiens 


Human secreted protein encoded by gene 44
clone HTDAD22.


149


59





1101 


AF119851 


Homo sapiens 


PRO1722


183 


72 


1102


G04086


Homo sapiens 


Human secreted protein, SEQ ID NO: 8167. 


207 


62 


1103 


G04063 


Homo sapiens 


Human secreted protein, SEQ ID NO: 8144. 


91 


52


1104


X74856


Mus 

musculus 


ribosomal protein L28 


128 


69 


1105


G03789


Homo sapiens 


Human secreted protein, SEQ ID NO: 7870. 


130 


62 


1106




Homo sapiens 


Human secreted protein, SEQ ID NO: 7214 


122 


48


1107 


G03040 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7121.


69 


43 


1108


AF039942


Homo sapiens 


HCF-binding transcription factor Zhangfei 


744 


99


1109


A*2QlSol 


Homo sapiens 


high affinity immunoglobulin epsilon receptor 
beta subunit 


738 


94 


1110


a n 1 1 mo 
Ar UIIUo 


Mus 

musculus 


transient receptor potential 2 


223 


79 


1111


AF119900


Homo sapiens 


PRO2822


144 


59 


1112




Homo sapiens 


A protein that interacts with presenilins. 


265 


39 


1113 


G02872


Homo sapiens 


Human secreted protein, SEQ ID NO: 6953. 


178 


67 


1114


YU2999 


Homo sapiens 


Fragment of human secreted protein encoded by 
gene 121. 


164 


63 


1115


YJUoil 


Homo sapiens 


Human secreted protein encoded from gene 1. 


1217 


99 


1116


A51j94 


Xenopus 
laevis 


APEG precursor protein 


130 


40 


1117


M27826


Homo sapiens 


neutral protease large subunit 


442 


65 


1118


G03371


Homo sapiens 


Human secreted protein, SEQ ID NO: 7452. 


72 


6b 


1119


G03602


Homo sapiens 


Human secreted protein, SEQ ID NO: 7683. 


491 


97 


1120 


Y35906 


Homo sapiens 


Extended human secreted protein sequence, SEQ 
ID NO. 155. 


244 


97 


1121


G03714


Homo sapiens 


Human secreted protein, SEQ ID NO: 7795. 


122 


<J5 


1122 


Y00337 


Homo sapiens 


Human secreted protein encoded by gene 81. 


110 


90 


1123


AF084830


Homo sapiens 


two pore domain channel; TASK-2 


703 


94 


1124


AF212862


Homo sapiens 


membrane interacting protein of RGS16 


442 


88 


1125


W64469 


Homo sapiens 


Human secreted protein from clone CW795_2.


191 


53 


1126 


G01361


Homo sapiens 


Human secreted protein, SEQ ID NO: 5442 


154 


100 


1127 


G01361


Homo sapiens 


Human secreted protein, SEQ ID NO: 5442. 


165 


100 


1128 


Y84320 




Human cardiovascular system associated protein
kinase- 1. 


815 


99 


1129 


G02105 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6186. 


88 


73 


1130 


Y32923 


Homo sapiens 


Transmembrane domain containing protein clone 
HP01512. 


700 


100 


1131


Y29817


Homo sapiens


Human synapse related glycoprotein 2. 


260 


91 


1132 


Y91644 


Homo sapiens 


Human secreted protein sequence encoded by 
gene 43 SEQ ID NO:317.


525 


96 


1133 
1134 


Y91449 
AB017908 


Homo sapiens 
Homo sapiens 


Human secreted protein sequence encoded by
gene 49 SEQ ID NO:170.

4F2 light chain 


j4z 


100 


1135 


X51760 


Homo sapiens 


zinc finger protein (583 AA) 


2399 

310 


93 


1136
1137


Y99426
G03790


Homo sapiens 
Homo sapiens 


Human PRO1604 (UNQ785) amino acid 

sequence SEQ ID NO:308. 

Human secreted protein, SEQ ID NO: 7871 


917 
102 


72 
50 


1138 
1139 

1140


AF155106
AL031055


Homo sapiens
Homo sapiens

Bos taurus 


NY-REN-36 antigen 

dJ28H20.1 (novel protein similar to membrane
transport proteins) 


768 
117 


91
50 


1141 
1142 


Y70018 
G04091 


Homo sapiens 
Homo sapiens


regulator of G-protein signaling 7 

Human Protease and associated protein-12

(PPRG-12). 

rluman secreted nrntpin kph in >jn» sno 


13a 

623 
Hi 


fob 


1143 


AB030235 


Canis
familiaris


Human secreted protein, SEQ ID NO: 8172.

D4 dopamine receptor 


113 
89 


38 
48 


1144 

1145 
1146 


Y94922 

X99962 
G03807 


Homo sapiens 

Homo sapiens 
Homo sapiens 


Human secreted protein clone pv6_1 protein
sequence SEQ ID NO:50.
rab-related GTP-binding protein
Human secreted protein, SEQ ID NO: 7888.


539 

398 
168 


88 

96 
79 


1147

1148 

1149 


G03712 
Y28279 
U13642 


Homo sapiens 
Homo sapiens
Caenorhabditi 


Human secreted protein, SEQ ID NO: 7793. 
Human G-protein coupled receptor GRIR-1. 
exon 5 similar to transmembrane domain of S.


512 
705 
247 


85 
76 
36 





1201 


G00838 


Homo sapiens


Human secreted protein, SEQ ID NO: 4919. 


404 


50 


1202 


M27826


Homo sapiens


neutral protease large subunit 


202 


49 


1203 


Y73424 


Homo sapiens 


Human secreted protein clone yi4_1 protein
sequence SEQ ID NO:70. 


265 


61 


1204 


AF264014 


Homo sapiens 


scavenger receptor cysteine-rich type 1 protein 
M160 precursor 


625 


98 


1205 


Y36203 


Homo sapiens


Human secreted protein #75. 


219 


59 


1206 


U78111 


Gallus gallus


AQ 


205 


57 


1207 




Homo sapiens


putative G protein-coupled receptor 


416 


76 


1208 


AF116715 


Homo sapiens 


PRO2829


127 


75 


1209 




Homo sapiens 


MaxiK channel beta 2 subunit 


475 


95 


1210 




Homo sapiens 


hepatocellular carcinoma-related putative tumor
suppressor 


423 


79 


1211






Human secreted protein encoded by gene No. 
107. 


224 


70 


1212 


G00719 


Homo sapiens 


Human secreted protein, SEQ ID NO: 4800. 


117 


44 


1213 


G01009


Homo sapiens 


Human secreted protein, SEQ ID NO: 5090. 


351 


73 


1214 


AF090942 


Homo sapiens 


PRO0657 


124 


70 






Homo sapiens 


Human secreted protein encoded by gene 17 
clone HSIEA14. 


99 


77 


1216 


G03905 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7986.


173 


D J 


1217 


Y57897 


Homo sapiens 


Human transmembrane protein HTMPN-2I. 


1173 


100 


1218 


J00194 


Homo sapiens 


hla-dr antigen alpha chain 




7ft 
/o 


1219 


Y59709 


Homo sapiens 


Secreted protein 76-28-3-A12-FL1. 


470 


92 


1220 


W51D /0 


Homo sapiens 


EBV-induced G-protein coupled receptor (EBI-
2) polypeptide. 


725 


100 




Wyo745 


Homo sapiens 


High affinity immunoglobulin E receptor-like 
protein (IGERB). 


650 


98 


1222 
1223 


Y35911 

■VnA078 


Homo sapiens 
Homo sapiens 


Extended human secreted protein sequence. SEQ 
ID NO. 160. 


135 


31 


1224 
1225 


AP1£1A')') 

/vrioi'h&z 
U14970 


Homo sapiens 
Homo sapiens 


Human secreted protein encoded by gene 21. 
HSPC304 

ribosomal protein S5 


260 
568 


95 
90 


1226 




Homo sapiens 


Human secreted protein, SEQ ID NO: 5814.


202 
610 


95 
100 


1227 


AF099973 


Mus 

musculus 


schlafen2


333 


56 


1228 


G01218 


Homo sapiens 


Human secreted protein, SEQ ID NO: 5299. 


155 


81 


1229 


AF217188 


Mus 

musculus 


YIP1B


801 


63 


1230 


AF176813 


Homo sapiens 


soluble adenylyl cyclase 


275 


100 


1231 


X98333 


Homo sapiens 


organic cation transporter 


1704 


100 


1232 


W74955 


Homo sapiens 


Human secreted protein encoded by gene 77 
clone HOEAS24. 


212 


53 


1233 


Y94940 


Homo sapiens 


Human secreted protein clone yi62_1 protein
sequence SEQ ID NO:86. 


526 


100 


1234 

1235 
1236 


U76618 

AF044924 
G01459 


Mus 

musculus 

Homo sapiens

Homo sapiens 


N-RAP 
hook2 protein 


482 
380 


Da 

97 


1237 
1238 

1239


AF000018 
W88633 

W29660 


Homo sapiens 
Homo sapiens 

Homo sapiens 


Human secreted protein, SEQ ID NO: 5540. 
adapter protein 

Secreted protein encoded by gene 100 clone 
HE8EU04. 


417 
164 
250 


100 

84 

90 


1240 
1241 


AF004161
Y92710 


Oryctolagus 
cuniculus 
Homo sapiens 


Homo sapiens CH27 1 clone secreted protein
peroxisome Ca-dependent solute carrier


697 
154 


52 


1242 


Y95002 


Homo sapiens 


Human membrane-associated protein Zsig24. 
Human secreted protein vc34_1, SEQ ID NO:44.


709 
908 


97 
88 


1243 
1244 


Y44905 
AF284422 


Homo sapiens 
Homo sapiens 


Human potassium channel molecule ERG-LP2 
partial protein. 


325 


100 


1245
1246


Y53629 
AB039371 


Homo sapiens 
Homo sapiens 


cation-chloride cotransporter-interacting protein 
A bone marrow secreted protein designated 
BMS115. 


511 
1888 


97 
93 


1247 


Y35911


Homo sapiens 


mitochondrial ABC transporter 3 

Extended human secreted protein sequence, SEQ
ID NO. 160.


389

168


97
39


1248 


AF072509 


Rattus 
norvegicus 


glutamate receptor interacting protein 2 


Tc5 

559 


on 
y{J 


1249 


AF247042 


Homo sapiens 


tandem pore domain potassium channel TRAAK


00 1 


yo 


1250 


B08974 


Homo sapiens 


Human secreted protein sequence encoded by 
gene 27 SEQ ID NO:131.


1U5 f 


y / 


1251 


L15313 


Caenorhabditis
elegans


putative




59


1252 


Y29338 


Homo sapiens 


Human secreted protein clone it217_2 alternate
reading frame protein. 




75 

In 


1253 


W01730 


Homo sapiens 


Human G-protein receptor HPRAJ70. 


11 1 

21 1 




1254 


G03074 


Homo sapiens 


Human secreted protein, SEQ ID NO: 7155. 






1255 


G01818 


Homo sapiens 


Human secreted protein, SEQ ID NO: 5899. 


253 


01 

y* 


1256 


AF286368 


Homo sapiens 


eppin-1 


222 


54 


1257


AF2202& 


Homo sapiens 


MOST-1 


87 1 


yj 


1258 


G02227 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6308. 


281 


7o 


1259 


Y07970 


Homo sapiens 


Human secreted protein fragment #2 encoded 
from gene 26. 


81 


94 


1260 


R95332 


Homo sapiens 


Tumor necrosis factor receptor 1 death domain 
ligand (clone 3TW).


986


100


1261 


AF140674 


Homo sapiens 


zinc metalloprotease ADAMTS6 


172 


36 


1262 


U28369 


Homo sapiens 


semaphorin V 


237 


67 


1263 


Y07049 


Homo sapiens 


Renal cancer associated antigen precursor 
sequence. 


288


71 


1264 


Y36153 


Homo sapiens 


Human secreted protein #25.


187 


80 


1265 


Y78114 


Homo sapiens 


Human cytokine signal regulator CKSR-2 SEQ 
ID NO:2.


723 


93 


1266


Y13397 


Homo sapiens 


Amino acid sequence of protein PRO334.


191 


100 


1267 


AF030558 


Rattus 
norvegicus 


phosphatidylinositol 5-phosphate 4-kinase 
gamma 


859 


95 


1268 


U73167 


Homo sapiens 


candidate tumor suppressor gene LUCA-1 


159 


96 


1269 


AF190664 


Mus 

musculus 


LMBR2 


552 


76 


1270 


AL050332 


Homo sapiens 


dJ570F3.1 (homolog of the rat synaptic ras
GTPase-activating protein p135 SynGAP)


820 


98 


1271 


G02126 


Homo sapiens 


Human secreted protein, SEQ ID NO: 6207.


131 


95 


1272 


AF125533 


Homo sapiens 


NADH-cytochrome b5 reductase isoform 


253 


92 


1273 


AL035661 


Homo sapiens 


dJ568C11.3 (novel AMP-binding enzyme
similar to acetyl-coenzyme A synthetase
(acetate-coA ligase)) 


1280 


100 


1274 


AF064748 


Mus 

musculus 


S3-12 


3523 


£.1 


1275 


D17554 


Homo sapiens 


TAXREB107 


377 


Jo 


1276 


Y30715 


Homo sapiens 


Amino acid sequence of a human secreted 
protein. 


643 


90 


1277 


AF146760


Homo sapiens 


septin 2-like cell division control protein 


707 


100 


1278 


Y05069 


Homo sapiens 


Human PIGR-2 protein sequence. 


28 1 


*fO 


1279 


X59668 


Oryctolagus 
cuniculus 


aorta CNG channel (rACNG) 


267 


85 


1280 


G01051 


Homo sapiens 


Human secreted protein, SEQ ID NO: 5132. 


489 


98 

k"i 


1281 


G03411


Homo sapiens 


Human secreted protein, SEQ ID NO: 7492. 


120 


43 


1282 


AF055084 


Homo sapiens 


very large G-protein coupled receptor-1


1635 


100 


1283 


AF117814


Mus 

musculus 


odd-skipped related 1 protein 


357 


98 


1284 


U87318 


Xenopus 
laevis 


NaDC-2 


535 


60 


1285 


AF061346 


Mus 

musculus 


Edpl protein 


452 


68 


1286 


AB030182 


Mus 

musculus 


contains transmembrane (TM) region 


582 


68 


1287 


A13595 


synthetic 
construct 


immunosuppressive protein PP15


185 


97 


1288 


AF254411 


Homo sapiens 


ser/arg-rich pre-mRNA splicing factor SR-AI 


837 


100 


1289 


AF084205 


Rattus 
norvegicus 


serine/threonine protein kinase TAO1


319 


98 
1290


AF038563


Homo sapiens


membrane associated guanylate kinase 2


1291 
1292 


AF034837
M15888 


Homo sapiens 
Bos taurus


double-stranded RNA specific adenosine
deaminase 


523 
468 


100 
100 


1293 
1294 


AB010692 
AF209923 


Arabidopsis
thaliana 
Homo sapiens 


endozepine-related protein precursor 
ATP-dependent RNA helicase-like protein

orphan G-protein coupled receptor 


937 
636 


87 
45 


1295 


W67828 




Human secreted protein encoded by gene 22
clone HFBAF41. 


1570 
504 


100 
98 


1296 


AC004832 


Homo sapiens 


similar to 45 kDa secretory protein ; similar to 
CAA10644.1 (PID:g4164418)


648 


65 


1297 
1298 


X80035 
G02645 


Oryctolagus 
cuniculus 
Homo sapiens 


cysteine rich hair keratin associated protein
Human secreted protein, SEQ ID NO: 6726 


575 
223


70
97 


1299 
1300


V59440
W70504 


Homo sapiens 
Homo sapiens 


Human delta3 fragment 4i. 

Leukocyte seven times membrane-penetrating 

type receptor protein JEG18.


122
459


32 
81 


1301 

1302 
1303 
1304 


Y67315 

M77693 
G01331
G01491 


Homo sapiens

Homo sapiens 
Homo sapiens 
Homo sapiens 


Human secreted protein BL89J3 amino acid 
sequence. 

spermidine/spermine N1-acetyltransferase
Human secreted protein, SEQ ID NO: 5412. 
Human secreted protein, SEQ ID NO: 5572.


3916 

174 
254 


99 

96 
69 


1305 
1306 
1307 

1308 
1309 


AF148509 

G01658 

Y90899 

AF033120 
Y73388 


Homo sapiens 
Homo sapiens 
Homo sapiens 

Homo sapiens 
Homo sapiens 


alpha 1,2-mannosidase
Human secreted protein, SEQ ID NO: 5739. 
D1-like dopamine receptor activity modifying
protein SEQ ID NO: 1. 
p53 regulated PA26-T2 nuclear protein 


747 
602 
333 
332 

348 


99 
98 
98 
98 

52 


1310 
1311 

1312 


AF063243 
AF224494 

Y73342 


TtnQ tannic 

Mus 

musculus 


HTRM clone 3376404 protein sequence. 

ribosomal protein L30 

arsenite inducible RNA associated protein 


147 

296 
688 


66 
90 

70


1313 
1314 


Y99419 
AF116667


Homo sapiens 
Homo sapiens 


HTRM clone 2709055 protein sequence.
Human PRO1780 (UNQ842) amino acid 
sequence SEQ ID NO:282. 
PRO1777


1154 
1145 


100

78 


1315 
1316 


W75100 
AJ272078 


Homo sapiens 
Homo sapiens 


Human secreted protein encoded by gene 44

clone HE8CJ26.

APOBEC-1 stimulating protein 


433 
807 

789 


97 
97 

100 


1317 
1318 


AB041533 
U19617 


Homo sapiens 
Mus
musculus 


sperm antigen 

"kii-i 


2607 
806 


98 
92 


1319 


U82598 


Escherichia 
coli 


ferric enterobactin transport protein 


768 


100


1320 


D90892 


Escherichia 
coli 


SORBITOL-6-PHOSPHATE 2-
DEHYDROGENASE (EC 1.1.1.140)
(GLUCITOL-6-PHOSPHATE
DEHYDROGENASE) (KETOSEPHOSPHATE
REDUCTASE).


709 


100 


1321 
1323 


W67847 

AJ276101 
AJ276101 


Homo sapiens 

Homo sapiens 
Homo sapiens 


Human secreted protein encoded by gene 41
clone HPBCJ74. 
GPRC5B protein 
GPRC5B protein 


601 
466 


92 
93 


1324 
1325 

1326 


Y58628 

U91561
AF125533 


Homo sapiens 
Rattus 
norvegicus 
Homo sapiens 


Protein regulating gene expression PRGE-21.
pyridoxine 5'-phosphate oxidase


504 

1584 

1277 


97 
100 

89


1327 

1328 
1329 


Y32206 

AF151048 
Y10530 


Homo sapiens 

Homo sapiens 
Homo sapiens 


NADH-cytochrome b5 reductase isoform 
Human receptor molecule (REC) encoded by 
Incyte clone 2825826. 
HSPC214 


1606 
1531 

657 


100 
90 

85 


1330 
1331 

1332 
1333 


AF180681 
AF111856

Y13583 
AF078866


Homo sapiens 
Homo sapiens 

Homo sapiens 
Homo sapiens


olfactory receptor

guanine nucleotide exchange factor 

sodium dependent phosphate transporter isoform 

NaPi-3b 

G-protein coupled receptor

SURF4


1645 
4314 

5591 

2171 
1395 


100

99 

99 

100 
100 
1334


Y25755


Homo sapiens


Human secreted protein encoded from gene 45.


1380
742


96

95


1335 
1336 
1337 

1338 
1339 


AF152325 

X74070 

AF095927 

G03877 
AL008582 


Homo sapiens 
Homo sapiens 
Rattus 
norvegicus 
Homo sapiens 
Homo sapiens 


protocadherin gamma A5 
transcription factor BTF3 
protein phosphatase 2C 

Human secreted protein, SEQ ID NO: 7958.
bEC223H9.2 (ortholog of A. thaliana F23F1.»)
leukemia inhibitory factor receptor 


639 
1931 

621 
626 
5820 


81 
95 

100 
100 
99 


1340 
1341 
1342 
1343 


X61615 
Y01519 
AF207600 
U54807 


Homo sapiens 
Homo sapiens 
Homo sapiens 
Rattus 
norvegicus 


A carcinogenesis-inhibiting protein.
ethanolamine kinase
GTP-binding protein 


7528 
2372 
1167 


97 

100 

97 


1344 


AC020579 


Arabidopsis 
thaliana


putative phosphoribosylformylglycinamidine
synthase; 25509-29950 


3283
944


51 
100


1345 
1346 

1347 


Y28576 
W74787 

M55542 


Homo sapiens 
Homo sapiens 

Homo sapiens 


Secreted peptide clone oe503 1. 

Human secreted protein encoded by gene 58 

clone HHFHN61. 

guanylate binding protein isoform I 


1171 

2636 
1329 


100 

87 
100 


1348 
1349 
1350


AF183428 

U70669 

AF295530 


Homo sapiens 
Homo sapiens 
Homo sapiens 


28.4 kDa protein 
Fas-ligand associated factor 3 
cardiac voltage gated potassium channel 
modulatory subunit 


167 
562 


24 
99 



TABLE 3 



SEQ ID
NO: of
nucleotide
sequence


SEQ ID
NO: of
peptide
sequence


Method


SEQ ID NO:
in USSN
09/496,914


Predicted beginning
nucleotide location
corresponding to first
amino acid residue of
peptide sequence


Predicted end
nucleotide location
corresponding to last
amino acid residue
of peptide sequence


Amino acid sequence (A=Alanine, C=Cysteine,
D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion)


1 


1351 


A 


2 


337 


1 


" mLIHQAPTPCPAGLWG/PPNGHYHGS'FGLJ 
HWPQAPHRA* * "GLLPPRWLGHGLPGGPAAP 
WAASQWVDGVAGRLPGPAWSWHASGAAPA 
OPGPL*LLVPGSSGLPDPRDP 


2 


1352 


A 


27 


100 


366 


IRNSSIRPMKEB£TKLSAKHM1TCSASYD1RGL 
QIETTVYHHTPIRMAKIQKT/GHHQC**ECGAT 
GTLIHGWWGCKWEPLGKTVWQIPiC 


3 


1353 


A 


40 


3 


314 


. H AS AHAS WLKDN SELEQQLG ATG A YRARA 
LBLEAEVAEMRQMLQLEHPFVNGADKLRPD 
SMYVHLNEL 4 QSLVENMLLTVVDTH\RTPI*R 
SCNYTLALILFL 


4 


1354 


A 


74 


2 


292 


TASALFSCPDGGSLAGFAGRRASFHLECLKR 
QKDRGGDISQKTVLPLHLVHHQVAHTFGQAT 
VTCQQARQSPG*RTNPE/ALQWVLPVSDGWH 
VLPLP 


5 


1355 


A 


78 


114 


850 


' ENCRV ASNLPGVFFSEDTAQSGSYMR1S AHFl? 
NAGGEVSNGPKRKLTLMLNFSLPSSGLNAGA 
FYALSTLLNRMVIWHYPGEEVNAGMGLTIVI 
AGMLGAVISGIWLDRSKTYKETTLVVYIMDT 
GG AWWC YTFYLGTGDTCG* CF1TAGYTMGFF 
MTGYLPLGFEFAVELVSYPESEGISSGLLNISA 
QVFGnFTISQGQIIDNYGTKPGNIFLCVFLTLG 
AALTAFUCADLRRQKANKETLEN 


6


1356 


A 


81 


97 


376 


EWFSYMLGSNMSVYHSP*SLEPLCK.VLSliS'A 
YLR VPFIRILLNAR* JRKAYKRMSLEIKLLI/RE 
♦CLFQEMGLSLQWLYSARGDFFRATSRL 


7


1357 


A 


93 


2 


872 


TLSSACLIGDAWKELTIVAGAVSNQLLVWYF 
ATALADNKPVAPDRR1SGHVGIIFSMSYLESK 
GLLATASEDRSVRTVVKGGDLRVPGGRVQNIG 
HCFGHSARVWQVKLLENYLISAGEDCVCLV 
WSHEGEILQAFRGHQGRGIRAIAAHEIIQAWV 

iTGGDDSGIRLwHLVGRGYRGLG/DLOSLLQ 

VP**ARYTQGCDSGWLLATAGSD* YRGPVSL 

*RRGQVLGAAARG*TFPVLLPAGGSSWSRGL 

RIVCYGQWGRSCQGCPHQHSNCCCGPDPVS 

WEGAQLELGPAWL 


8 


1358 


A 


106 


3 


350 


FSSLLSGRISTLRDJBTGAIL1DGDPAACAPIIKF " 
LLTEELHLRGVS1YVLRHEAQIYGITPLWCAL 
U/CRRL*SDSCMRAALNDRGLYQVLILDGLV 
QaGFVDSDSRKMVSTLT 


9 


1359 


A 


115 


49 


186 


QAWAIFKGKYKEGDTGGPAVWKTRLRCALN" 
KSSEFNEGPERERMDV 


10 


1360 


A 


123 


2 


1249 


KGCRTQEKVDRTEVIRTCINP\TSKLFTVDFY 

FEEVQRLRFEVHDISSNHNGLKEADFLGGME 

CTLGQIVSQRKLSKSLLKHGNTAGKSSITVIA 

EELSGNDDYVELAFNARKLDDKDFFSKSDPF 

LECFRMNDDATQQLVHRTEWMNNLSPAWK 

SFKVSVNSLCSGDPDRRLKCJVWDWDSNGK 

HDFIGEFTSTFKEMRGAMEGKQVQWECINPK 

YKAJCKKNYK^TSGTVILNLCKIHkMHSFLDYI 

MGGCQIQFTVAIDFTASNGDPRNSCSLHYIHP 

YQPNEYLKALVAVGEICQDYDSDKMFPAFGF 

GARJPPEYTDSHDFAINFNEDNPECAGIQGVV 

EAYQSCFVPKAPTFTGPTNICPHSSRKVAKFRR 

SEGN* HQGRAFAIIFIL VDPGQVGVYSQDMGP 

DNPGGHFV 


11 


1361 


A 


147 


614 


9 


ACARKQIXGRTVFIWFVGQLLGGELKGYSKT 

NTTSSRPASSRG\TLSSSSSSSSSLTKDALPSSL 

KSDSTTITSGLVFPFRSLCVNPAKSSVSESVSSI 

KILLSSSVKYLE*KRTSCCFPDSSESiaSQLSS 

DERVSMGTSSRKPTNSSSSLGALKMSATSW 

SGSESPTPFFLTGLQSPPSTRPREPGLTTARNS 

TTLTRDC 


12 


1362 


A 


177 


12 


416 


LIPSEPALDSLVDPRVRSRKQPFVIYPVYDTAI 

DIKIHFSLLDGNVGEPDMSAGFCPNHKAAM 

VLFLDRVYGIEVQDFLLHLLEGGFLPDLRAA 

ASLDT/AEIGAMDFLLS*LFTLCLMMFFFIYPFI 

NIXTMNVY 


13 


1363 


A 


249 


535 


105 


wtfhrhlspaplivcdqgtcvvsyypOnivq 
mpdtqmeqgln/hlfldgna*phsvecycps 
tfeiaikttsfvlyfhryrapevllrssvysspi 
dvwavgsimaelymlrplfpgtsevdeifkic 
qvlgtpkkvstlvpkll 


14 


1364 


A 


254 


572 


201 


YLLTXIGNLMMLLV1NADSCLRTXM*FFLGH 
FFFLDICYSSVTAQDAAEFPVS*KPILVWGYIT 
♦SFFFIFSWGTNGCLLSAITYACYAAICHPLLS 
TMVMNRPLCTATVNATNKMGFLNSQVN 


15 


1365 


A 


257 


425 


68 


THAKFLKKKFNIPKLVILPKLVYIVKAIPTKM 
AffiFLLECDQNIT\KLICENT*KNIAKNI*KRRV 
TFTPIET* HPVKQMIKWQ*LTAWLRNRGYKKI 
KQTPNSETAPSVCRNLVFDKCG 


16 


1366 


A 


263 


104 


481 


FC1FRTTEEDRGGDDCVVSVWTKQRNNSCVK " 
SKDVFSKPVNIFWALEESVLGVKARQPKPFFA 
AGNTFEMTCKVSSKNDCSPRYSVLIMAEKPV 
GDLSSPNETKY1ISLDQDSWKLENWTDASRV 


17 


1367 


A 


298 


68 


208 


RKRTNNPIKLDKKFEHFKNEDI* ITSKHTXM W " 
VSSLAMKEMLTKTTM 


18 


1368 


A 


300 


904 


1 


LWGITGTRHHARVIFIFLVETGFPHVGQAGL 
ELLTSGDPPALASQSAGITGMSHCARPKGHFG 
IHLK*MFYTMSQkMP*PTl^LILLUIPGNLNIF 
KPNMGWLGPK.TAFV*KDEVLSGIPFAKGRCR 
TO*DY*CyLQEVIT)PIMEKGKKKKRTASFFK 
nrsPHnQTKTAT T HRrVR+RYHLSVTVETAGLP* 
KNTGHIPGQPF1JFKLVFKC*NVICI* *QYKW*Q 
NIGVKNKSFCPH*SSSPSL*FIGHHSRNF/CSFK 
TEPHSVVQAGGQWRNLSSLQAPPPGLMPLSR 
ISLMSSWDYRRPPQ 


19 


1369 


A 


302 


3 


445 


"NSPSRW AK1QMFEHTFCG* GCG/ER/NVHIHCS 
WICRLRPLL WRA VRE YL SKLKN AELSFDPG V 
SLLRIYAIDMPTSI*DEKEALLFAFLAFHE*HC 
KSR1WAV1Q/CIHLWDWLRKL*CFHRMKFYA 
AV*NKPRHLLSHIWKDVQNILLK 


20 


1370 


A 


304 


1 


1339 


"FFFCGKEVPLFEQNKHPGPRATTSPGA/HARA 
LLSAGEFTAGVGLSP*AIHSFVWLCTFIQHGA 
GGPCHQPGGSPGPWMHTTQAGHLWEGAYPG 
GSSTWHQVPGQLGGSWGPRERSLLGSFIKCSP 
CPHPPGFRLWMSPNQKPPTENPGVMGRVWR 
LMPGESFLI W bAKtjls.liiJrlt*&A jduvjui jo.lv i v t\ 
PLHSSLGNTVKP*PKNQKPKQNRSRHGQ\GF 
MAGQOQSRPAAR*PPCPALTPASHSAGTWPP 
RICRTVPGGPCPSPSGFRSCRR*GFSA*TRSWP 
DAEPPSTPDTAPRCCTQSDTSSQGPQ*S*WRR 
CRALPGRLCSAPAAGLRRARPRLSESRRGNSP 
PASPAAASARCPSWGPSCPARPPSRPAAGTEP 
AAPSRCTAWLRGEREPGPRPPGRRPRSGRGP 
VSFAPEVLSLPAVRQTKS^WRNEEErTRPW 
ALVKSRUVj 


21 


1371 


A 


326 


799 


1587 


GSQVLPPPPSQDS ATLPQD A^OPKAArugf V u 

E*GLQGAGVRRLRGEVLCQPQP*GAL*EQCLP 

HLSFSPRQGAAPDTEPSAWGPAPTGATGPGLP 

LRHVRLFSAGAPRGAATPCPPALLHGPAWPP 

ARPMFRGHPPVRPLGPWGKVAAGPRALCLA 

r \/p a \rncsar A tic PSfl* ClL*PAHLRGPPGPEVL 

QWHWQLSAGRDFVPAEDPPL*EGPLGPGGPA 

AAQAEPGADPEPEDKDQAAESRPAGAMSLSA 

QGSGPVGGQGLR 


22 


1372 


A 


327 


146 


652 


PHLENPHPEHSFPGAPLT*STLSWS1LSPREPSP 
C* A PPVPfrT-TPHT FNPHLEHLLTWRTVTWSTLL 
PGAPCYPEHPHLEHPLTWSTPHLEHPSPGEPL 
SCRTPTRSILHRDHPLP*CLSTEESPI*GWGSLP 
APPSTPLVLDVAPPGPQPASSCPGRDSCYSVP 
GTWSP 


23 


1373 


A 


348 


397 


it 


" " CIVSSCCKjTRKPCMLEDANKINKQSPTLEKIES 
LQESL*VKQ*LIVAEKYVQILHPRKKYFQRPL 
Nl^KRKMKKIOCEEKKKCRERMQRRSKWRR 
EEKKE* RREE\EERKKEKEDRKERRXETSPRG 
SRRLLRD 


24 


1374 


A 


362 


170 


352 


" GRALDTAAGSPVQTAHGLPSDALAPLDDSMP 
WEGRTTAQWSLHRKRHLARTLLVSRVRGPQ 


25 


1375 


A 


384 


373 


128 


ylittileYgyLwknrhsdq*krtenperdqh 
kypkvdfcksnsmknrlcnkwhwtnwiftd 
• kkinlnlkphtkltpnikkn 


26 


1376 


A 


397 


383 


"T65 


" " EVKNTNPF1FSGTNLT1 WIRSI*RKSDEINQRTK 
*MEKYSISIJDRRLNTVKMSFLPNLrVKFNTISI 
KIPANF 


27 


1377 


A 


406 


103 


380 


KSKATGYMYW*KIJV\FLYANDEQLEIEMNK 
IWWGSKNKIAFimTKYQNIQNRHAENYKI 
LVNKJEDLNKWRNV^LSWIGRRNI INTMT 


28 


1378 


A 


408 


14 




TICrNKFNNLDEl^^^ 

JTLKTSRETELVINK*VIPHKEKPGPDSFTGEF 

YQTFKEEL/II/ILHKLFQTIKYGREPNSVYETSI 

TLKPKPEKDL\KENYRPLPLSNIDAK\LNKTLA 
NRI**HIR 


29 


1379 


A 


434 


395 


128 


IYSKMCMERQRLNN'OJaCNKVRGlAVPDVK " 

VYYKPTVIK^SWIL^KDSHIVEWNJ^ENLEID 

PN/IKRLILDKGAEATEWRKDSFFRQWO 


30 


1380 


A 


455 


2 


228 


FFFETESHSVlQAGVQWCNPGFKRFSCFGLSS " 

SWDYRYAPPRP\ANF\*FLVETGFYYVAQAGL 

KLLSPGDLPALAS 


31 


1381 


A 


4& 


3$3 


2 


QLMFDliOVlWmW<5Wl?Pnk*VWKNWiSi 
CRRMNLNPYLSRYIKINSRIKDLTVRPEriKLV 
EENTGKTIQDTGLGK*FIAKTSKAQSTKTNK* 

KRQTRYIKLK\KKSTASKENNRVKRQPLE*EK 
IFAN 


32 


1382 


A 


474 


125 


471 


VKPYEIAVFLVKPIEYK*HLLSDPAIPLSGI*LK 
EIKAYT/RRICTPMFAAPVSVIA/RN*KQSKy'CQ 
KQ*YVHRMEYYITIKRSEILICnTWVDFKNT 
ILRETDRIHKTTYDVISLI 


33 


1383 


A 


488 


1825 


2 


KSACSFlCSEEQPASPSPLiO>GTYASET\RPRDP 

HAAGPRRDSSEAETRRPRGA/DGSGTWKGT 

PGSPAPPCSWGHGG\ETEGAG'CPAAPGTDLR 

APGGSAGS *\GLPSAGGSRGRKG WRAAGRQP 

STR*GRPORHGGRGE*AGHPEPRQSALQSAG 

L/ASSPEPMGAALAEDGSGDSRGAGPRPQE*P 

PSVLSRS\GS*G*G*AASGTASSPRSHSSRLGPP 

SAGFHGLRCGQPPFAAAPPGPWPGTGRPAGG 

AGSPPAAAGTAPPATRGAQSRRQNRTAGRNA 

SPQTAAGAGSPVQWALSR4TG*TGETGSWC 

AGGTHQATHLTAAWVCPPTWSVRPGGSGPA 

AGLGR*GRHPAQSPPLPVPRG*PAWPQEAPSP 

SPASSEVALSSGSCWPDQAPGPARGSPPAPLA 

PAWPAAGRGRQR*GRQSAHPPPRR*STAVSL 

SGTS*WRRSP*AGTRTQQC*SPWLVPACSSRP 

L*RGTRRPSTQQSPQTTGTPGRSAGPGHPRS* 

GGRSPAGTGHLGAQTVASPH*GHWPTALSCL 

WASASPPGPEAPPQTGACIGTNCRYRAASAR 

RSSVAPACA*GWQ*AGSPPAVLRGPP*RVRER 

GALTHRPRAPDE 


34 


1384 


A 


497 


422 


2 


APGASVGRAQAAEG*RGGPTGRPPSALGVS/E 
AGRAGRAGEGRPVPPAYPLCKSAQTSGPPKA 
RLSVPPLASCGGRGPPGGAACATCAPPAGPAR 
SSRCRRRSPPE*GPR*PSRPARPSPGSAASRRQ 
KLTPCRCQFRGLCA 


35 


1385 


A 


509 


156 


475 


PTPYPGE^QAAFLLRGPGLRPPA/DPSLR/HRN 

LTELWAVTDENIVGLFAALLAERRVLLTAS 

KLSTLTSCDHAFCALLYPMRWEHVLIPTLPPH 
LLDYC* CPPLPRT 




1386 


A


512 


3 


1031 


FFFSFVCHLYCVSPTPGPHGRLATWL/PGLLA 

FLGLAAGGQTLCPAGELPGHARAQASGAPGS 

VLIAVPGRRRVHTCGPGPAAPSTRGECPPPAL 

GHTRPARPRPV\PFAPAVPQEPGGQGHGAA/P 

PATGHSAPRGCPPARAAPTGSATPAPPPAACA 

AFHSAWSVPPAGRQQG*RVPAPAFRRTTPGT 

PGQHLLDRPGAPPAQGSGPAPAPPPRLAGPA 

GPAAPPPGPPAASWHSSLSKSSSSUGWSPPLP 

VGPGSLQ*TPPPQGPHLSGSCGGTSSWRGQR 

AAYARRLRSWNACXiLSRVAGRSSASYPGRE 
ORPSQSQ^AGPtGMRGCCLRGWFSissijSU 
GPGPHPASTWLRAGKTGPSPPACGCA*LPPPS 
VSAAPQSPRTRCPRGCAAAAGLCVLAAAGAS 
HG A\GLPG\^VHTQRVHIH 4 GAG/GCQTPRPR 
LRSLPVLGLPAPRCPVSAHPWHRRSGSSCHA 
ARLVPRHPAPGCP* *TG*\PLTTGFPEP*A*GLP 
NHQAVGLEASGALQAGHRDELPTMVQLLDH 
SPDYPLKGRPHAP 


37 


1387 


A 


620 


828 




"FRLPLAAGA/RGAAEPRVAVSMAPDPSAKJH 
WEASPEMQSKCHQKGKNNQTECFNHVRFLQ 
RLNSTHLYACGTHAFQPLCAAIDAEAFTLPTS 
FEEGKEKCPYDPARGFTGLUDGGLYTATRYE 
FRSIPDIRRSRHPHSLRTEETPMHWLNG^EDE 
AQDDGG* GTISSFLLPWPADHPTPKSPGEPVH 
SffVCCQVRGQPQSGGKESPACLKSLSNCLTH 
\DAEFVFSVLVRESKASAVGDDDKVYYFFTE 
RATEKESGSFTQSRSSHRVARGIPPL 


38 


1388 


A 


739 


1


427 


FRAMVSSTLKLGISILNGGNAEVQ/QGNKUKU 
TSEEGKEG*EVPV*LPVSPPLPRPLQKMLDYL 
KDKKEVGFFQSIQALMQTC\GEKVMADDEFT 
QDLFRFLQLLCEGHNNDFQNYLRTQTGNTTT 
INIIICTVDYLLRLQESI 


39 


1389 


A 


767


1


1030 


" TLDLTGPLLLGGVPNVPKDFRGRNRQi'XjGUM 
RNL S VIXjKKVDMAGFIANNGTREGC AARRN 
FCDGRRRQNGGTCVNRWNMYLCECPLRFGG 
KNCEQGEWP AS SCPPVTAAWEALLLDVPGTT 
VRGLHIQVRQPLVVYAAFTVDSHRPLQETVL 
RRAPAPASGVPSPSGVGWDR*AGPAEPSPSTP 
ATVIISVPWYLGLMFRTRXKEDSVLMEATSGG 
PTSFRLQVTOAPCHQGTC*VGARGR0PMLSG 
LRVTDGEWHHLLIELKNVKEDSEMKHLVTM 
TLDYGMDQVSWHLHLLWG*TLPPAQGKTGA 
SEDKVS VRRGFRG CMQVRGGCGGRGEACPS 
QAAPRL 


40 


1390 


A 


801 


69 


399 


"1HKIHHKEDLNKWKYILCSGMERLSTYM1P V V 
PQ1IYKFNA*Q\VILKFT\V*E* GAKIULRKNKL 
RGLVLWLSTC*VKYLLDKVLPimCTYYEAR 
VNKSWLVQVTIM 


41 


1391 


A 




1 / 




" SMLKERKVFQFPSCLFFQYITWLGPPYHYLhU 
SSVTNFSIGAK*D1LQSVMNCLYAKRIPCVT 


42 


1392 


A 


841 


1 


415 


" "'GSTHASGYDKTPDFILQVPVAVEGHIIH WlliS 
KASFGDECSHHAYLHDQFWSYWNSLKHRTW 
QGIGTVASNLSQL+TLNAPFPELLLFRSLARTG 
FVLT*VRFGPGLVIYWYGFIQELDCNRERGILL 
KACFPTNIVTL 


43 


1393 


A 


845 


358 


92 


" PALSPAPVPQKKGSPLPLDPCLGPSSWLLSYU 
LGWPRL*PRRGPGDPGSLPATPPLLTPPHTLLP 
ORPMLPPSHAGLARPPPPEPISVP 


44 


1394 


A 


853 


452 


1 


" " LPQYCFFPRLSPKSKLVKHSAL**PSALKPPTK 
SPRCIPRTSLYFTICC/PPALQUSPIEDPPAIYRS 
PPTHMLRSASQPLNQAPTLVKGHPPSRFLQG 
QVSCPPQPTLPREKPLPLHLRPPPRPAQPPLPR 
PLTFSTRRKVDPEIPERFR 


45 


1395 


A 


894 


379 


162 


" GVYPPTVFDNYSVQTSVDGQIVSLNTWDTAG 
QEEYD/RLRTLS*PQTS1FV1CFSIGNLEFPIYGT 
WLSMSMGK 


46 


1396 


A 


900 


1 


366 


" TTKKTLISNWSSRSLPILPELKAFSLAFNDPL 
EIQKYMRT/DQ* CVPTHDISLYIVTKLALIFLIPR 
VT , I.FHQLNIT* c *a.HFFTMTTFIAiPFSFLFLGR 
D/KSLAMLFRLVSNSWPQVILPP


47 


1397 


A 


944 


162 


2 


QLQNLASRGCL* SQLLRRLRKENRLNPGGGG 
CSEIAP\CTPAWVTQRDFFRKKK 


48 


1398


A 


963 


216 




rir 1 rURIAlVKNTRDSHCWROC*EEGAPARC 


49


1399 


A 


967 


466 


1 


PRKRESWWGERLP/PRGFPPAAEDAPAPGWK" 
GRKHASRTARAHVFHPIRQSIRSPVRGRPGDP 
RAAHTRSAGTRLQCKASRGG*GKGPAPTR* E 
GGPGSAPAPLPASSGCSLFPDSSPWTPPPPAPG 
AAAAQP* ♦TPRCPAALRAGAH1GRVGRPY 


50 


1400 


A 


973 


45 


421 


EKCIQALD\ r FVFCYIDHSSHCLMSCD+E/DQA" 
LNFMPLEMEPKMSKLAFGCQRSSTSDDDSGC 
ALEEYAWVPPGLRPEQIQLYFACLPEEKVPY 
VNSPGEKHRIKQLLYQLPPHDNEVRYCQSLSE 


51 


1401 


A 


992 


2095 


194 


IRIRHEAARSCLGCAAGHVPAPGLRJLXPTVRG* 

PPGRRGPAAPGCVCY* SGESTFVSHVPQRMA 

WPGSAPPRGFHPLQSQTSPSDTVSSPQLSKEE 

DGPGWEHPLSSSL*SLGQAGGNH*QPEELAG 

WEPRGPPSIAPSSPT/TMWTALVLIW1FSLSLS 

ESHAASNDPRNFVPNKMWKGLVKRNASVET 

VDNKTSEDVTMAAASPVTLTKGTSAAHLNS 

MEVTTEDTSRTDVSEPATSGVAADGVTS1APT 

AVASSTTAASITTAASSMTVASSAPTTAASST 

TVASIAPTTAASSMTAASSTPMTLALPAPTST 

STGRTPSTTATGHPSLSTALAQVPKSSALPRT 

ATLATLATRAQTVATTANTSSPMSTRPSPSKH 

MPSDTAASPVPPMRPQAQGPISQVSVDQPW 

NTTNK5TPMPS>TTTPEPAPTPTVVTTTKAQAR 

EPTASPVPVPHTSPIPEMEAMSP1TQPSPMPYT 

QRAAGPGTSQAPEQVETEATPGTDSTGPTPRS 

SGGTKMPATDSCQPSTQGQYMV/DHH*APHP 

GRGRQNSPSGGAVTRGDPFHHSLGFVCPAGL 

♦ELQEEGLHPGGLLNQRDVCGLRNVRGAGA 

WREAWPLPRPFLLPLRPNQVLPNSFGAIEEIC 

QMLKHI 


52 


1402 


A


994


1 


462 


ESGEFLVSFTLKKPTNVFHHINGMKFFNK/LIF 
*SHTDIAFYKIQHPFMLKALTKWA*EGT*PDR 
RYLH* SLRLNGEQLKTFPLRSGNfR*G/CAlLPL 
VLNAMLSIVPA WP AGKTR HEKFTTCP] JGQE 
hK*r b*b VGDMNTCVENKKESKKLLE 


53 


1403 


A 


1011 


1 


630 


PEVIQQSAYDSKAt)IWSLOlTAIELAKGEPPN$ 
DMHPMRVLFLIPKNNPPTHCWRRLLESFKEV 
*LMLA*TKDPSI\RPTAKELLKHKFIVKNSKKT 
a r L. i EislUKrKK w K AEGHSDDRSDSEGSDSES 
TSRENNTHPEWSmVRKKPDPKKVQNGAEQ 
DLVQTLSCLSMUTPAFAELKQQDENNASRNQ 
AIEELEKSIAVAEAAGPG 


54 


1404 


A 


1016 


1 


222 


ISIDA*KAFDKIQH/CFMITTLKKLGIDGKYLN 
TIKAIDDRHTVSTIliNVEKLK AFL+RSGTRQRF 
P1SGSGARI 


55 


1405 


A 


1033 


3 


366 


HASVDGDEGSDDVYYYYTPAILRELQALNTA 
EAAEHRPEEDRMLSEDPWRPAHMEKGYMPL 
HNIPHTEVIDVTGLNQSHLYQHLNKGTPMKT 
QKRAANLYTWHVLEQLEILRQINQQSHGPG 


56 


1406 


A 


1044 


5 


429 


SVLTLQTRSPSKPLSUUaMDWEWSRNSISE " 

DRLETQSRASRSPPVTPNQSQETPVDGKPLAL 

PPNQSQKNIRYHIHYLHLQYYLDRHISATLPIP 

SSSGIPTPIAVjTDALTDLVELILGQPCSEESGR 

APGTLFLLAL 
57


1407 


A 


1050 


11


430 


GAYAFETOGFPIMLVLiTDKJEGDVGiAGLYD 
MH\ISLPMAFLLRTLVRCTSYIIPVTHVLSTPV 
TCLRRREKDGV1VDVLSDTASNHNGFPVEEH 
ADDTHPARLQGPTLRSQPMGPLKHKAFEERA 
NLGLVQRRLRLED 


58 




A 


111 JO 


258 


419 


LKHRDTP WGANNRALiiCTPLTSLTLCALCPL 
PCLGCPTXATCRLYQTTVAWF 


59 


1409 


A 


1064 


3


425 


"KAFSFrrSLIGHQRNfflTGERPYKCKJiCGlCl P 
KGSSSLNNHQRIHTGEKPYKCNECGRAFSQC 
SSLIQHHRIHTGEKFYECTQCGKAFTSISRLSR 
HHRIHIX3EKPFHCNECGKVFSYHSALIIHQRIH 
TGEKPYACKDVGK 


60 


1410 


A 


1065 


204 


419 


GGPPGPFtAHl^JtiAOLQAPGPLLAPAGDEGDL 
LLLAVQQSCLADHLLTASWGGK/DPIPTKALG 

EGQEGLPLTV 


61 


1411 


A 


1079 


3 


383 


RHSRAHLCX^PFHLVMRDLLQLGQDIPQGCHY 

LEENHLMUDIAARNCLLSCAAPTRAATIGDE 

GMARYIYRTRYYQLGDRAL/LPRKWMPPEAL 

LEGIFTYNTDSWTFGVLLWE1FSLGYMPYPGR 

TO 


62 


1412


A 


1080 


1 


ojy 


WEFLWSRRPSGSSDPRPRRPASKCQMMJblSR 

ANLMHMMKLSIKVLLQSALSLGRSLDADHA 

PLWFFVVMEHCLKHGLKVKKSFIGQNKSFF 

GPLELVEKLCPEASDIATSVRNLPELKTAVGR 

GRAWL YL ALMQKKLAD YLKVLIDNKHLL SE 

FYEPEALMMEEEGMV1VGLLVGLNVLDANL\ 

CLKGEDLDSQVGV1DFSLYLKDVQDLDGGKE 

HERITDVLDQKNYVEELNRHLSCTVGDLQTK 

IDGLEKTNSKLQERVSAATDRICSLQEEQQQL 

REQNELER 


63 


1413 


A 


1083 


2 


615 


SSFAKHKRIHTGEKPFICLtCGKAFTSSTTLl K 

HRRIHTGEXPYTCEECGKAFRQSAILYVHRRI 

HTGEKPYTCGECGKTFRQSANLYAHKKIHTG 

EKPYTCGDCGKTFRQSANLYAHKKIHTG\EKP 

YKCKECGKAFKSYYSILKHKRTHTRGMSYEG 

DEC/QRSLN/RS SIL SNHKI1HN EEK/PLKCEKCE 

KAFNHTSICCRHKKN 


64 


1414


A








' ' KKQDLSSSLTDDSKNAQAPLALTESHLA1 LA 
SSSQSPEAIKQLLDSGLPSLLVRSLASFCFSHIS 
SSESIAQSJDISQDKLRRHHVPQQCNKMPITAD 
LVAPILRFLTEVGNSH1MKDWLGGSEVNPLW 
TALLFLLCHSGSTSGS\HNLG\AQQDQCKISFS 
FreWLTTGLTTQQRTAIE\NATVAFF\LQCPiSC 
HFNNQKLMAQVLCELFQTSPQRGNLPTSGNI 
SNGFIRVRLFLQLMLEDEKVTMFLQSPCPLYKG 
RINATSHVIQHPVMYGAGHKFRTLHLPVSTTL 
SDVLDRVSDTPS1TAKLISKQKDDKKKK 


65 


1415 


A 


1087 


103 


324 


7 TrafSfvhtemivg^vqnihlftlqvledra 
ijtmsvgsslwstyl1hvmalp/drellkpna 
svalhklsnalv 


66 


1416 


A 


1095 


3 


493 


HETCS VTHIVSr SLPr LNrarir Aa I run l cr* n,^ 

PSLVWFDRGKFYLTFEGSSRGPSPLTMGAQD 

TLPVAAAFTETVNAYFKGADPSKCIVKITGE 

MVLSFPAG1TRHFANNPSPAALTFRVINFSRLE 

HVLPOTQLLCCDNTQNDAOTKXEFWVNMPNL 


67 


1417 


A 


1098 


57 


356


" TiaTSLGFllGVSVVGNLLlSlLLVKX>Kl LHKA 
PYYFLLDLCCSDILRSAICFPFVFNSVKNGST 
WTYGTLTCKVIAFLGVLSCFHTAFMLFCISVT 
RYL


68 




A 

> 


« t r\r 


i 




MGKISATOINMGTKCSWALVWHLESYDPICH 
YEREGMQDWKTASOQSEEATQQSSQKPQPH 
YTTYQSSSFLKYSSESHLLAWRENSSEGSFQF 

PGRSRARPPRTRQQRRGAAAGPGRGAVRLG 
HPQSAAOPOLRAA APJPFSPA AFPAOPPPr.c a 

RNSDASGPASLSRTLGRASSPRPPQAPDVTAP 
SPAALAPRAARGGSRAAALAGAEAEEPLRTL 
APRPTRAAAPPPPPPPPPLPPGAPPPPVRCVSR 
RARAPPWR/PAATGPPP\RPVAPSRKLGSARAP 
APALOIRKGTSSGLPGRGGG^fiPnisrMT <iqva 

GNWRGSSFAVERPGMAKYQGEVQSLKLDDD 
SVIEGVSDQVLVAVWSFALIATLVYALFRNV 
HQNIHPENQELVRVLREQLQTEQDAPAATRQ 
QFYTDMYCPICLHQASFPVETNCGHLFCGSLT 
PNSIW 


69 


1419 


A 


1107 


2 


466 


FDTARLHEFGTSITQIFAVDNREDLQKWMEA 
FWQHFFDI^QWKHCCEELMKIEIMSPRKPPLF 

TNGQFLIGQREESLP/SS/CGPHSLMVTIKWSS 
RKRY/SYPASEPLHDEKGKKRQAPLPPSDK 


70 


1420 


A 


1111


698


23 


ALRRLHYVRATKVNFLSFRRPFWREEHIEGGH ' 
SNTDRPSRMIFYPPPREGALI T A<IVTW<:n a a a 

AFAGLSREEALRIALDDVAALHGPWRQLW 

DGTGVVKRWAEDQHSQGGFWQPPALWQT 

EKDDWTVPYGRIYFAGEHTAYPHGWVETAV 

KSALRAAIKINSRKGPASDTASPEGHASDMEG 

QGHVHGVASSPSHDLAKEEGSHPPVQGQLSL 

QNTTHTRTSH 


71 


1421 


A 


1119 


2 


385 


QKQTLQNGYLDSSMDILYLGSLPPELQVSSDE 
PPGPPEQAGLSQFHLEPETQNPETTEEIQSSVLQ 
QEAAAQLPQLPEWELSSTKA\EAPALPSQSL 
EGVHSSTEOTCAPAOOT PAFPFll APT r juxxc 


72 


1422 


A


1127 


1 


906 


^QVVOPYRLERTLGKGO'rOLVRLGVriClT 

GQKVAIKIVNREKLSESVLMKVEREiAIL\RLI 

EHPHVLKLHGVYENKKYFPPDELTSGPSMLA 

QVSPHGKLSARRSWDLLSGFPRYLVLBHVSG 

GELFDYLVXKGRLTPKEARKFFRQ1VSALDFC 

HSYSICHRDLKPENLLLDEKNNIRIADFGMAS 

LQVGDSLLETSCGSPHYACPEVIKGEKYDGR 

RADMWSCGVILFALLVGALPFDDDNLRQLLE 

K\^GVFHMPHrTPPDCOSlXRGMTFVFPFTfP 

LSLEQIQKHPWYLGGNFIS 


73 


1423 


A 


1128 


1 


802 


LRNALDVLHREVPRVLVNLVDFLNPTIMRQV 
FLGNPDKCPVQQA/MLEPLGSKTETLDLRAE 
MPITCPTQNEPFLRTPRNSNYTYPIKPAIENWG 
SDFLCTE WKASNS VPTS VHOLRPADTK W a a 

LGDSLTTAVGARPNNSSDLPTSWRGLSWSIG 

GDGNLETHTTLPNTLKKFNPYLLGFSTSTWEG 

TAGLNVAAEGARARDMPAQAWDLVERMKN 

SPDINLEKDWKLVTLFIGGNDLCHYCENPEA 

HLATEYVQHIQQALDILSE 


74 


1424 


A


1139 


60 


480 


FREPCLLVPGDHQPLREASWLA/LPPIGLWGT ' 

DSPLCCVEVAIPCNKGAHSVGLKGWLLAQG 

VLGMRDTIPQEHPWESTPDLCFCRDPEEIEVE 

EQPAADAAVAKGEF/QGEQIAPVPAMIAAHPE 

AADPAPVHTTAHPKGA 


75 


1425 


A 


1147 


2 


413 


PFPHQHPQEP\KGSCWPQSALRGQCPGPVLGV 
TTTSDLCSLQVPVSSHRNPLLDLAAYDQEGR 
RFDNFSSLSIQWESTRPVLASIEPELPMQLVSQ 1 

DDESGQKKLHGLQAILVHEASGTTATTATAT 

GYQESHLSSAR 


76


1426 


A 


1155 


38 


410 


P1ISAPAQDDPILLSFIHCLHANLLCVWRRDVK 

PDCKEIWIFWWGDEPNLVNVQYIMNCMLWK 

KDSGKMAFPMNVGRCTFFKETii^LERCLMD 

KNFVLIGKWFVRPYYKDEKPV>^SEHLSCAF 

T 


77


1427 


A 


1162 


526 


350 


RFPQGLEDVSTYPVLIEELLSRGWSEEELQUY | 
LRGNLLRVFRQVEKVQEENKWQSPLED 


78


1428 


A 


1171 




1293 


MAESASPPSSSAAAPAAEPGVTTEQPGPRSFF 

SSPPGLEEPLDGADPHVPHPDLAPIAFFCLRQT 

TSPRNWCIKMVCNPWFECVSN1LVILLNCVTL 

GMYQPCDDMDCLSDRCKILQVFDDFIFIFFA 

MENfVLKMVALGIFGKKCYLGDTWNRLDFFI 

VMAGMVEYSLDLQNINLS.\IRTVRVLRPLKA 

mRVPSMRILVNLLLDTLPMLGNVLLLCFFVF 

FffGIIGVQLWAGlIJRl^RCFLEENFnQGDVAL 

PP\YYQPEEDDEMPFICSLSGDNGIMGCHEIPP 

LKEQGRECCLSKDDVYDFGAERQDLNASGL t 

CVNWNRYYNVCRTGSANPHKGAINFDNIGY 

AWrVffQVrTLEGWVEIMYYVMDAHSFYNFI 

YFILLIIVSVREPGLLGGSFSTAQSPKCQGDSFP 

GVAAESLLLRGWVXWLPGGG 1 


79 


1429 


A 


1175 


1 


405 


PNDFFKDMFPDLPGCPLGPnCAENDYUAYLN 
FLSATHLGGLFPPWPLVEERKLKPKASQQCPI 
CHKViMGAGKLPRHMRTHTGEKPYMCTICE 
VRFTRQDKXKJHMRKHTGERPYLCIHCN AKF 
i VHNYDLKNHMR 


80 


1430 


A 


1182 


25 


198 


rEMNELSQQLSQQG<jRGASQCPSPPAPTLPNPI 
PLCOLOLORVKTGLPTPPCHPGAGAA J 


81 


1431 


A 


1186 


254 


583 


KTVLDVGAGTGILSIFCAQAGARRVYAVEAS 
AIWQQAREWRFNGLEDRVHVLPGPVETVEL 
PEQVDAIVSEWMGYGLLHESMLSSVLHARTK 
VVKDGGFFLPXS SELFM _j 


82 


1432 


A 


1187 


2 


716 


DFVDAARNLPLESTKSPAEPSKSVPSLEVDFKA 
SSQGLPSQGPVQNQGRRGEQRPKKF/TVTQHT 
SSFEKSDSLEQPSGLEGEDKPLAQFPSPPPAPH 
GRSAHSLQPKLVRQPN1QVPEILVTEEPDRPD 
TEPEPPPKEPEKTEEFQWPQGSQTLAQFPVEK 
LPPKKKRLGLAKMAQSSGES SFESS VPLFRSP 
SQESNVSLSGSSRSALFERDDHGKAEAPSPSF 
1 DMGPKPLGTHMLTV 


83 


1433 


A 


1188 


517 


804 


ESPGLSKVLRTGAFAYPWJPDKLPLFYRLGLC j 

WGRGHGCGQEALSTSHGYHLFCALLTGFLFA 

SHLPER1APGRFDYIGHSHQLFHICAVLGTHF 


84 


1434 


A 


1192


45


476 


LGDVGFWVERTPVHEAAQRGESLQLQQLIES 
GACVNQVTVDS1TPLHAASLQGQARCVQLLL 
AAGAQVDARNIDGSTPLCECLRLGQHRVCEA 
1 LAVLRGQGQPSPVHSVPPARGLHXREFRMC* 
GFLFDVGXNLEAHEFHFGEP 


85 


1435 


A 


1194


69


410 


krseeasappfplggtgaaptraslpeqili.fr 
sclearksqpdekllsalhnsrtwn* eprrs q 
hrlvspevhpgrrgsspgvaeckltsayfrt 
grspcpslpgttrtnsll 


86 


1436 


A 


1215 


3


405 


LPSHTCGNPGRLPNGIQQGSTFNLGDKVRYSC 
NLGFFLEGHAVLTCHAGSENSATWDFPLPSC 
1 R ADD ACGGTLRG/ AEWHHI .QPPLPLG/ATKN 
nadctwtilaelgdtialvfidfqledgVdfL 

bVTGTEGSSLW 


87 


1437 


A 


1216 


226 


964 


GTARFGPMVGFGANRRAGRLPSLVLQVLLV 
VimAHWVSISSRHVLLQEEVAELQGQVQ 
RTEVARGRLEKRNSDLFAWGHAQETDRPEG 
GRLRPPOOPAAGORGPREEMXEDDKVia OMM 

ISYQMADIHHLKEQLAELRQEFLRQEDQLQD 
YRKNNTYLVKRLEYESFQCGQQMKELRAQH 
EENKKLADQFLEEQKQETQKIQSNDGKELDI 
NNQWPKNIPKVAENVADKNEEPSSNHIPHG 


88 


1438 


A 


1218 


1 


534 


PEFGTTISCGYLMATDVSRKPSVHKAVE1EQE 
RVKSAGAW1IHPYSDFRFYWDL1MLLLMVGN 
Lrv^VGITFFKEENSP\PWIVFNVLSDTFFLLD 
L VLNFRTGI WEEGAEIf XAPR ATRTR YT RTW 

FLVDLISSIPVDYIFLVVELEPRLDAEVYKTAR 
AXRIVRFTKILSLLRL 


89 


1439 


A 


1223 


1 


743 


MGFDEVFMINLRRRQDRRERMLRALQAQEIE 

CRLVEAVDGKVGMLTRSNAAPfiUHT AMT FT 

LVWAPRFVDADNLILNPDTLSLLIAXNICTVV 

APMLDSRAAYSNFWCGMTSQGYYKRTPAY1 

PIRKRDRRGCFAVPMVHSTFLIDLRKAASRNL 

\AFYPPHPDYTWSFDDIIVFAFSCFCQ\AEVQMY 

VCNKEEYGFLPVPLRAHSTLQDEAESFMHVQ 

LEVMVPSSPSSAQSMAWSADHIGLVISYL 


90 


1440 


A 


1227 


2 


349 


MCTSFIFYLKNIWAI>LIMTLTFPFRIVHDAGF 
GPWDFKFILCRYTSVLFYANMDTSIWLGLrr/ 
YDRY/WKVVRHLAVDSWMTG1/SFTRVYLLG 
LG ARL V WFGKLI LAKGGHGGIS WL 


91 


1441 


A 


1245 


3 


1937 


LGSSDVRAPQRSELGAESPSRMVASQAYNLT 

SALTPILTRSRVLNEEPLTLAGF\SRAPANLSD 

WQLIFLVDSNPFPFGYISNYTVSTKVASMAF 

QTQAGAQIPIERLASERAJTVKVPNNSDWAAR 

GHRSSANSVWQPQAFVGAWTLDSSNPAAV 

LHLQJUm-LLDGRYLSEEPEPYLAVYLHSEPR 

PNEHNCSASRRIRPESLQGADHRPYTFFISPGT 

RDPVGSYRLNLSSHFRWSALEVSVGLYTSLC 

QYFSEEDVVWRTEGLLPLEETSPRQAVCLTR 

HLTAFGTSLFVPPSH1RPYFPEPTADVNYIVML 

TCAVCLVTYMVMAAILHK1 DOT DA<TRGR ATP 

FCGQRGRFKYEILVKTGWGRGSGTTAHVGIM 

LYGVDSRSGHRHLDGDRAFHRNSLDIFQIATP 

HSLGSMWKIRVWHDNKGLSPAWFLQHIIVRD 

LQTARSTFFLVNDWLSVETEANGGLVEKEVL 

AASKASFRVPTPSVAALLRFRRLLVAELQRGF 

FDKHIWLSIWDRPPRSCFTRIQRATCCVLLICL 

RGANAVWYGAVGDSAYSTGRVSRLNPLSV 

DTVAVGLVSSWVYPVYLAILFLFRMSRSKV 

GWGWGPGSTGNGAWASAPCPEPPLSSAAAR 

GKGVHQRULGKGQHT 


92 


1442 


A 


1246


5 


562 


VFDEENTLNELNDPLREEIVNFNCRiaVATMP 

LFANADFNFVTAMLSKLRFEVFQPGDYIIREG 

AVGKKMYFIQHGVAGVITKSSKEMKLTDGS 

YFGEICLLTKGRRTASVRADTYCRLYSLSVD 

NFNEVLEEYPMMRRAFETVAIDRLDRIGKKN 

SILLQKFQKDLNTGVFNTVQENEILKQIVKH 


93 


1443 


A 


1249


180 


901 


TVPPPPGGPSPAPLHPKRSPTSTGEAELKEERL 
PGRKASCSTAGSGSRGLPP^SSPMVSSAHNPN 
KAEIPERRTOSTSTP^PPSMMTRRNTYVCT 
ERPGAERPSLLPNGKENSSGTPRVPPASPSSHS 






WO 01/57188 PCT/US01/03800 



LAPPSGERSRLARGSTIRSTFHGGQVRDRRAG 
GGGGGGVQNGPPASPTLAHEAAPLPAGRPRP 
ITr^FTKLTSKLTRRVADEPERIGGPEVTRRP 
RQEDHLSPGGRGCSEL 


94 


1444 


A 


1261 


3 


385 


KFSQWGLTKPKLSNASPAVISLVKKLMKKWS 
VTQNLTFREQLEAGIRYFDLRVSSKPGDADQ 
EOTIHGLFGIKVWDGLMEIDSFLTQHPQEIIFL 
DFNHFYAMDETHHKCLVLRIQEAFGNKLCPA 

CR 


95 


1445 


A 


1282 


2


550 


GPRDNPG\EDPRFEIVEHFGIAWFTFELVARFA 

VAPDFLKFFKNALNLIDLMSIVPFYITLWNL 

VVESTPTLANLGRVAQVLRLMRIFR1LKLARH 

STGLRSLGATLKYSYKEVGLLLLYLSVGISIFS 

WAYTIEKEENVEGLATIPACWWWATVSMTT 

VGYGDWPGTTAGKLTASACILA 


96 


1446 


A 


1294 


1 


1456


QLLPPSNRENAGLLVGRCLCSAALRPVGDU I
SSGQVAVRNAPQAGSAKAGKGKFQDNFEFIQ 
YFKKFFDANCNEKDYNPVAAGQGQETEVAP 
SIYAPVLNKPNQCPEGY1CVKAGRNPNYGYT 
SFDTFSWAFLSLFRLMTQDYWENLYQLTLRA 
AETTYN1IF/LV/LVILLGSLYLVTLILAVA^ AMA 
YEEQNQATLEEAEQKEAEFQQMLEQLKK.QQ 
EAAQQAAT ATASEHSREP S AAGRLSDS S SEAS 
KLSSKSAKERRNRRKKRKQKEQSGGEEKDED 
EFQKSESEDSIRRKGFRFSIEGNRLTYEKRYSS 
PHQSLLSIRGSLFSPRRNSRTSLFSFRGRAKDV 
GSENDFADDEHSTFEDNESRRDSLFVPRRHGE 
RRNSNLSQTSRSSRMLAVFPANGKMHSTVDC 
NGWSLVGGPSVPTSPVGQLLPEVIIDKPAT0 
DNGTTTETEMRKRRSSSFHVSMDFLEDPSQR 
QRAMS1ASILTNTVE 


97 


1447 


A 


1295 


2 


2057 


IQTQLPTKSSQQLRKGGNCVRCKMQMNF1AE 

EVLLKYRITFYNNNKGPNMLYIEIKAFVHFMI 

NRYLSYGSGPKRFPLVDVLQYALEFASSKPV 

CTSPVDDIDASSPPSGSIPSQTLPSTTEQQGALS 

SELPSTSPSSVAAISSRSVIHKPFTQSRIPPDLP 

MHPAPRHTTEEELSVLESCLHRWRTE1ENDTR 

DLQESISRIHRTIELMYSDKSMIQVPYRLHAV 

LVHEGQANAGHYWAYIFDHRESRWMKYNDI 

AVTKSSWEELVRDSFGGYKNASAYCLMY1N 

DKAQFLIQE\DLIKTGQPLVGIETLPPDLRDFV 

EEDNQRFEKELEEWDAQLAQKALQEKLLAS 

QKLRESETSVTTAQAAGDPKYLEQPSRSDFSK 

HLKEETIQIITKASHEHEDKSPETVLQSAIKLE 

YARLVKIj\QEDTPPETDYRLHHVVVYFIQNQ 

APKK1IEKTLLEQFGDRNLSFDERCHNIMKVA 

QAKLEMIKPEEVNLEEYEEWHQDYRKFRETT 

MYLIIGLENFQRESY1DSLLFLICAYQNNKELL 

SKGLYRGHDEELISHYRRECLLKLNEQAAELF 

ESGEDREVNNGLIIMNEFIVPFLPLLLVDEMEE 

KDILAVEDMRNRWCSYLGQEMEPHLQEKLT 

DFLPKLLDCSMEIKSFHEPPKLPSYSTHELCER 

FARIMLSLSRTPADGR 


98 


1448


A 


1304 


118 


453 


SGPSSRATYLHRKEYSQNLTSEPTLLQHRVEH 
LMTCKQGSQRVQGPEDALQKLFEMDAHGRV 
WSQDLILQVRDGWLQLLDIETKEELDSYRLD 
S1QAMNVALNTCS YNS1L S 


99 


1449 


A 


1306


3 


1660


CgyfchttCApqappCpvppdllrtalgvhpe
tgtgtayegflsvprpsgvrrgwqrvfaals
DSRLLLFDAPDLRLSPPSGALLQVLDLRDPQF 

SATPVLASWmQSRDLPRIFRVTTSQI/AVPP 

TTCTVLLLAESEGERERWLQVLGELQRLLLD 

ARPRPRPVYTLKEAYDNGLPLLPHTLCAAILD 

QDRLALGTEEGLFVIHLRSNDIFQVGECRRVQ 

QLTLSPSAGLLWLCGRGPSVRLFALAELENI 

EV\EVPKIPESRGCQVLAAGSILQARTPVLCVA 

VKRQVLCYQLGPGPGPWQRRIRELQAPATVQ 

SLGLLGDRLCV GAAGGFALYPLLNEAAPLAL 

GAGLVPEELPPSRGGLGEALGAVELSLSEFLL 

LFTTAGlYVDGAGRKSRGrtELLWPAAPMGW 

GYAAPYLTVFSENSIDVFDVRRAEWVQTVPL 

KKWRPLNPEGSLFLYGTEKVRLTYLRNQLAE 

KDEFDIPDLTDNSRRQLFRTKSKRRFFFRVSE 

EQQKQQRREMLKDPFVRSKLISPPTNFNHLV 

HVGPANGRPGARDKSP 


100 


1450 


A 


1318 


918 


190 


SLCVPGPVDTGTFAVMSVMVGSVTESLAPQA 

LNDSMINETARDAARVQVASTLSVLVGLFQV 

GLGLIHFGFWTYLSEPLVRGYTTAAAVQVF 

VSQLKYVFGLHLSSHSGPLSLIYTVLEVCWKL 

PQSKVGTWTAAVAGWLWVKLLNDKLQQ 

QLPMPIPGELLTLIGATGISYGMGLKHRFEAG\ 

PPVAPNTQLFSKLVGSAFTIAWGFAIAISLGIC 

IFALRHGYRVDSNQVWVMRDV 


101 


1451 


A 


1353 


220 


445 


DWPDLFTYPLIGSPKCFQSARPE\RMYRRTVR 
SSHGNHALQEVLPRSGHGTEFTKQKHLEAAD 
HGHPPARMSIFSR 


102 


1452 


A 


1363 


542 


2 


AHLLMLNLAL\TDLL\YLTSLPFLIHYYASGEN 

WIFGDFMCKF1RFSFHFNLYSSILFLTCFSIFRY 

CVIIHPMSCFSIHKTRCAWACAVVWnSLVA 

VIPMTFLITSTNRTNRSACLDLTSSDELNTIKW 

miLTA\LLCLPLVIVTLCYTTIIHTLTHGHAN 

\DSCLKQKARRLTILLL 


103


1453 


A 


1371 


2 


410 


CHSTESSSDFILFGDYLLGGLCPLHSGCLQV\C 
SFNEHGYHLFQAMRLAVEEINNSTALLPNITL 
G YQL YDVCSDSANVY ATLRVLSLPGQHH1EL 

QGDLUTV'SPTVLAVIGPDSrNRAATTAALLSP 
FLVPMLLEQ 


104


1454


A 


1376 


3 


432 


NSRVEDRS/NMSLWTQNrrVCPVRNVTRDGG 
FGPWSPWQPCEHLDGDNSGSCLCRARSCDSP 
RPRCGGLDCLGPAIHIANCSRNGAWTPWSSW 
ALCSTSCGIGFQVRQRSCSNPAPRHGGRICVG 
KSREERFCNENTPCPVPIF 


105


1455


A 


1379 


2 


356 


GLGLLYLIFAAVEGVMRVIGGSNHLAVVLDD

IILAVIDSIFVWIFISLAQTMKTLRLRKNTVKF 

SLYRHFK^TLIFAVLASrVFMGWTTKTFRlAK 

CQSDWMERWVDDAFWSFLF\SLILIVIMFLW 

RPSA 


106 


1456 


A 


1383


1 


432 


EDGHGGWSSRCLYDHAEEGHREPWKRLCIW 
VKuuruiiKr Ar ir ruHPLLSPQICLAPETPPRG 
CPPVSSLHFISLQ/RLPRDCQELFQVGERQSGL 
FEIQPQGSPPFLVNCKMTSGTFWTCRTDSRVF 
QNANPSNAAHSEDQPTP 


107 


1457 


A 


1386 


719 


558 


WFVTRSHSVAQABCSOVFTAHRSLDLVGSSN 
YPALSLQSSWDHRHTWLIFAFL 


108 


1458 


A


1397 


61 


2 


RVAISLLCAAIFISFMVQSAGKRWPTGVMLM
WVLFAFLYSWPIQALLPTYLKTDLAYNPHT 
VANVLSFSGFGAAVGCX2V/GGFLGDWLGTRK 
AYVCS1XASQLUIPWAIGGANVWVLGLLLF
Tc^mlc^g^GILpiOLi^yfdtdqraacjLg 
ftynvgalggalapi1galiaqrldlgtalas 
lsfsltfwilrnrrpgkslvr 


109 


1459 


A 


1402 


15 


387 


VLV ALPDT\VTSET VVTEYLGHRVTLPCLY SS 
WSHNSNSMCWGKDQCPYSGCKEALIRTDGM 
RVTSRKSAKYRLQGTIPRGDVSLTILNPSESDS 
GVYCCRIEVPGWF>©VKJNYRLM/QRASTr 


110 


1460 


A 


1421 


3 


350 


HEDLSSLLTRGSGNQERERQLKKLISLRDWM 
L AELAFPVG VL ATCA* SLLSC* YCVILFPCSCF 
FFHSPDALFSLLLLSCYFPS YCFF YYLFFS SSPL 
CLLLASSPFPLFILLASL


111


1461


A 


1424 


2 


344 


")FTSTMTKPFEKESEQPA*ATLAFGAQTSrrAl> 
QCALKPDLSYLNNSSSSSSTPATSAGGGIFGSS 
TSSSNPPVATFVFGQSSDPVSSYGFVNTAESST 
SDSLLFSQDSKLATTS 


112 


1462 


A 


1434 


46 


372 


TTS^TTSCIRSCT* SGASSGPGWTPRTTW WK 
SRRSSQRTCSRACSGA\VSRTW*RSS*TSSSSC 
STSCSSSSSRSCGRPGGPLGARGVHITSCLNSC 
MSSSTTS STTSTF 


113 


1463 


A 


1439 


3 


292 


HEDIMTHYDRLVDE'ALNAGKQRYEKMISU 
MYLGEIVRNILIDFTKKGFLLRGQISEMLKTR 
GIFLTFLLSNFLIVCVLLFYVSFYLFQSCINFVL 


114 


1464 


A 


1463 


1 


396 


KQQAVPEPHSSTTTPQEQEQNWYGQDLLNLg 

QRTKVHLPGHKTGPAVAKDTPEPVKKEFTVP 

ATSQGP*SPFSEEPPLPPSNEEVPPTLPP*EPQS 

EDP*KNA*LKQMHAATTHWQQHQQHQVGC 

QYHGIMQ 


115 


1465 


A 


1464 


291 


2


AGSYPSMVWSCH WGVTQKRRAL* VYSFEEG
GRRKCGQYWPLEKDSRIRFGFLTVSNLGVEN 
MNHYKKSTLEILNPEVNPGFFFLTLWKQGEN 
NYCN 


116 


1466 


A 


1465 


667 


337 


LPPQRPA*TOSYSTCNVSSGFLAGQSHN1HLQ 
YWTKYQVWEWLQHFLDTNQLDANCIPFQEF 
DINGEHLCSMSLQEFTRAAGTAGQLLYSNLQ 
HLKWNGDSLFLCLSLPC J 


117 


1467 


A 


1479 


1 


JOl 


GTSGGPKRVLVTERFPWQNPLPVNRGQAQR 

VLGPSNSFQRVPLQAQKLVSSHKPGQNQKHK 

QLQATSVPHPVCMPLNNTQKSKQPLPSAPEN 

NPEEELASDPNNEESL*RPWALEDFEIGRPLG 

KGK 


118


1468


A 


1485 


5 


JO J 


' TYLWL*GNPPFYEKNDGGLFELILRAKDEFN[S 
PYWDDMSDSAKHFIRPLTGRDP*KPFPCDQPL 
QHPWIEGHTCLDNNIHQAASEPINNNFAESKR 
l^AFIATGVVRHKnUCLFMGANLEGPGPTVS 
H 


119 


1469 


A 


1486 


1 


398 


GTTSKHH+LARSURGPFDHDLKPNAATRDQL 

NIIVSYPPTKQLTYEEQDLGWKFRYYLTNQE 

KALTKFLKWVNWDLPQEAKQALELLGKWK 

PMDVKDSLELLSSHYT^^r^VRRYAVARLRQA 

DDEDLLMYL 


120 


1470 


A 


1497 


3 


999


MGESPAV*GYFVLAGMNSAGLSFGGGAGKY
LAEWMVHGYPSENVWELDLKRFGALQSSRT 
FLRHRVMEVMPLMYDLKVPHWDFQTGRQL 
RTSPLYDRLDAQGARWMEKHGFERPKYFVP 
PDKDLLALEQSKTFYKPDWFDIVESEVKCCK 
EAVCVIDMSSFTEFEITSTGDQALEVLQYLFS 
NDLDVPVGHTVHTGMLNEGGGYENDCSIARL 
NKRSFFMISFTDQQVHCWAWLKKHMPKDSN 
LLLEDVTWKYTALNLIGPRAVDVLSELSYAP
MTPDHFPSIJ<C^MsW^vAM/^ I
GEPGFMLYIP1EYRWGFTMLSTLVSNS 


121 


1471 


A 


1498 


3 


306 


AQFLLVOWDHIL*LIVL*TNLTFT riUTTrnnivi" 
WPNSPDVLNHGCFYMQCLSKDCTIGYVSRE 
N^VAHTHTVEEHTGTHLQYVSWPDHSVPDD 
SSDFVEFEN 


122


1472 


A 


1533 


121 


329 


LGLFSFVWTEVLEEPKDFSCETEDFKTLHCT 

WDPGTDTALGWSKQPSQSYTLFES+VGSGYII 
DNFFLA


123


1473


A 


1547 


111 


408 


DARTTWKPRNGSSGIWPGDGAK*PPAVEQAE~ 
RGHVEMTEKLTFLNLHTSEKDKGGNTALHLA 
AKHGHSPAVQVLLAQWQDINEMNEKQOTPL 
IIVAADRG 


124


1474 


A 


1555 


1 


745 


MTFDDDDKNTYGVAL VWKKFQTQST.RLSD1 , 

HRKSHLWRGIVSITLIEGRDLKAMDSNGLSDP 

YVKFRLGHQKYKSKIMPKTLNPQWREQFDF 

HLYEERGGV1DITAWDKDAGKRDDFIGRCQV 

DLSALSREQTHKLELQLEEGEGHLVLLVTLT 
ASATVSTSDT PTVnrpttTrETT vdvoot dt 

FHNLKDVGFLQVKVIRAEGLMAADVTGKSD 
PFC\TELNNDRLLTHTVYKNLNPEWNKVFTL 
♦VALVWKKFQTQSLRLSDLHRKSHLWRGIVS 
ITLIEGRDLKAMDSNGLSDPYVKFRLGHQKY 

AWDKDAGKRDDFIGRCQVDLSALSREQTHK 
LELQLEEGEGHLVT 1 VTT TA<5ATWrcnr 

SLEDQKEREEILKRYSPLRIFHNLKDVGFLQV 
KVIRAEGLMAADVTGKSDPFCVVELNNDRLL 
THTVYKNLNPEWNKVFTL 


125


1475 


A 


1556 


57 


509 


GGPAFNSRYAEP*KNSLAMT*AHADCENYVA 
CGGLDNICSIYNLKTREGNVRVSRELPGHTGY 
LSCCRFLDDSOrVTSSGnTTrAT wnTPTAnnT 

TTFTGHSGDVMSLSLSPDMRTFVSGACDASS 
KLWDIRDGMCRQSFTGHVSDINAVS


126


1476


A 


1592 


3 


178 


KSEKSCVSSLAHFGTSCORDYnAJUVK'T VTTr 
EMLPTCDLADQHN1KFHYAFALNR*ER 


127


1477 


A 


1612 


1 


497 


T13SPLLVRPYLPYITKSELHA1MTAGFSTIAGS 
VLGAYISFGVPSSHLLTASVMSAPASLAAAKL 
FWPETEKPK1TLKNAMKMESGDSGNLL*AAT 
QGASSSISLVANIAVNLIAFLALLSFMNSAT A 

WVGNMFDYPQLSFELICSYIFMPFSFMMGVE 
WPDSFM 


128


1478 


A


1619 


286 


486 


CCMNSKAQES VFKNVLCNPPALSEMPDVKA

EDEVDFRASSISEEVAVGSIAATLKMKQGPM 
TQAJNR


129


1479


A 


1627 


1 


395 


K1KGALRYWDFGKFLCNIWAAVDVRCCTAT1 
MGLCIISIDRYVGVSYPLRYPTJVTQRRGLMA 
LLCVW ALSLVIYIGPLLGWRHPAPEDETICQI 
NEEPGYVLFSTPGSFYLPLAIMLVMN*RVYRV 
AKTE 


130 


1480 


A 


1638 


2 


466 


DPRYRTKIV'NRKTTIYEIQDKTGSMAVVGKG 
ECHNiPCEKGDKLRLFCFRLRKRENMSKLMS 
EMHSFIQIQKNTNQR SHDSRSMAJ JPQEQSQHP 
KPSEASTTLPESHLKTPQMPPTTPSSSSFTKVT 
KDKDIK*LLFNLYSSVEILPEVLHLKT 


131 


1481 


A 


1651 


607 




LAEGGD VFDCVLN GGPIJESRAKALFRQMVi" 
AJRYCHGCGVAHRDLKCENALLQGFNLKLTD 
FGFAKVLPKSHRELSQTFCGSTAYAAPEVLQ 
3IPHDSKKGDVWSMGWLYVMLCASLPFDD
TOtPKMLWQ(jOKlGvSFPimSISADCQDLLK
RLLJEPDMlLRFSlbb V b W Hr W LAb 1 W V 
LSNKVGGESKPKKKK 


132 


1482 


A 


1656 


150 


48 


LVAKSLLYCGCLFFLLQIAKNVGNNSrTsTOIM 
EAKLTSPSPKPTPSSDM*VFLIY*TYFGAWHV 
VDAQ 


133 


1483 


A 


1660 


3 


406 


RKHtKLLIQKLSDVP*ECQNNQL*KLTEICEKE 
KJCEFKIOCNlDDQIU ) EKITEA*SKDKSPMEEEK 
TCMIRSYIQEVGRYIKRLEEAQSKRLEKLREK 
HKEIRQPILDEKPKGEGSSSPLSETa-IEDTSWF 
PNFTP 


134 


1484 


A 


1666 


1276 


466 


PGSTHASARITIY*L*IILSNATEVDNNFSKPPP 

FFPAGAPPASSSSSSSSSSPPTVSTAPPLIPPPGF 

PPPPGAPPPSLIPTIESGHSSGYDSRSARAFPYG 

NVAFPHLPGSAPSWPSLVDTSKQWDYYARSS 

SSSSSSSSSSSSSPRDRDRER*RTRERERERDHS 

PIPSVFNSDEERYRYREYAERGYERHRASRE 

KEERHRERRHREKEETRHKSSRSN SRRRHESE 

EGDSHRRHKHKKSKRSKEGKEAGSEPAPEQE 

STEATPAE 


135 


1485 


A 


1673 


1 


417 


PTRPVNSSQAFALVYYTLGALGGNLIAHMGL 
GYRYWAGIGVLQSCESALTHYRLVANHVAS 
D1SLTGGSWQRIRLPDEVENPGMNSGMLQE 
DLIQYYQFLAEKGDVQAQVGLGQLHLHGGR 
G V* QNHQRAFD YFNL AA 


136 


1486 


A 


1678 


525 


9 


ANTSLSSAAVSAVSPPPCRTSTATTLPPPMPSF 
FCVFPSPSMSPSPSEFLSC1ASVSRVHSLSSSSS 
GSSSTASSLNFSAIMGSSSATASWVLSTASTPP 
CPSALPSSPAQES*SLAASSSAWPVAGISPSGA 
CTFPAGSASGAAKAPSPSWRCPSFRALFSLLD 
SSSLSL 


137 


1487 


A 


1680 


1 


2999 


AHRDEIQRKFDALRNSCTVITDLEEQLNQLTE 

DNAELNNQNFYLSKQLDEASGANDEIVQLRS 

EVDHLRREITEREMQLTSQKQTMEALKTTCT 

MLEEQVMDLEALNDELLEKERQWEAWRSVL 

GDEKSQFECRVRELQRMLDTEKQSRARADQ 

RTTESRQVYELAVKEHKAEILALQQALKEQK 

LKAESl^DKU^lJEKKHAlvlLEMNARSLQQK 

LETERELKQRLLEEQAKLQQQMDLQKNHIFR 

LTQGLQEALDRADLLKTERSDLEYQLENIQV 

LYSHEKVKMEGTISQQTKLIDFLQAKMDQPA 

KKKKVPLQYNELKLALEKEKARCAELEEALQ 

KTRIELRSAREEAAHRKATDHPHPSTPATARQ 

QIAMSAIVRSPEHQPSAMSLLAPPSSRRKESST 

PEEFSRRLKERMHHNIPHRFNVGLNMRATKC 

AVCLDTVHFGRQASKCLECQVMCHPKCSTC 

LPATCGLPAEYVTHOFTEAFCRDKMNSPGLQT 

KEPSSSLHLEGWMKVPRNNKRGQQGWDRK 

YIVLEGSKVLIYDNEAREAGQRPVEEFELCLP 

DGDVblHGAVuAoiiLAiN 1 AkAD vr Y iLKMba 

HPHTTCWPGRTLYLLAPSFPDKQRWVTALES 

WAGGRVSREKAEADAKLLGNSLLKLEGDD 

RLDMNCTLPFSDQWLVGTEEGLYALNVLK 

NSLTHVPG1GAVFQIYIIKDLEKLLMIAGEERA 

LCLVDVKKVKQSLAQSHLPAQPDISPNIFEAV 

KGCHLFGAGKIENGLCICAAMPSKWILRYN 

ENLSKYCIRKEIETSEPCSC1HFTNYS1L1GTNK 

FYEIDMKQYTLEEFLDKNDHSLAPAVFAASS 

NSFPVSIVQVNSAGQREEYLLCFHEFGVFVDS
NSLEVIEIQARSSAGTPARAYLDIPNPRYLGPA
ISSGAIYLASSYQDKLRVICCKC5NLVKESGTE 
HHRGPSTSRR*PASPLPQYQGQRAFLQGRRK 


138 


1488 


A 


1686 


2 


526 


GRPQGPAPGAGSPPESGPGLWAALGCSLVWV 
PLCCLGGAAGRL*ARSGKSGLRRRRAHAGPP 
PGGPCNSCP*CSAPESGGRGPLPGPGTGGVCS 
CWTRGCXJTTARTAAAAAAPGPAGRRPPGGA 
PQNGSCAASASQEAAAPPPMCPPGRRWAVAS 
PPETRCPAAPGTRCRRLEAA 


139 


1489 


A 


1693 


3 


376 


LPSMSNCTSCFRLQSRTES*IRQAGHLLGRNE 
FIETKALGCAWFSLCYYLVLYFESSHKVDFVF 
£V* CFSTFPGAQMTIMSQ ACAERCNIMRL VDR 
RWAGIAKGVGTQKDGRVHLGEQKALGL 


140 


1490 


A 


1704 


3 


376 


ERTNKFIKELIMDGKNLIAATKSLSVAQRKFA 
HSLRDFKFEFIGDAVTDDERCIDASLREFSNFL 
KNLEEQREIMVS*EGCKUSQLSRGKKIWIWK 
LVLVEVVKHLSLGTVVHCNGKMRFPEP 


141 


1491 


A 


1743 


1 


362 


LITNKXWARELSCLD VHLD STGSTA VVADQ 
DKLELELVLKGSYEDTQTSFLGTASAFRFHY 
MAAL*TELSGRLRSSKSNGWNGDNSTGYLTV 
PLRPLTIVKEVTMDWAPNVRGLNWMG 


142 


1492 


A 


1769 


1 


406 


NNPSTLPRGS* PMSPRTTMGRRRQRRREHKSS 
LSLAS STVGPGGQIVHTETTEWLCGDPLSGF 
GLQLQGGIFATBTLSSPPLVCFIEPDSPAERCG 
LLQVGDRVLSINGIATEDGTMEEANQLLRDA 
ALAHKW 


143 


1493 


A 


1789 


1 


447 


QMLRNGGDQNTVPDYHFADRIRELL*PTEDQ 
KNCIP*DTYLRPSALGNIVEEVTHPCSPGPCPA 
NELCEVNRKGCTSGDPCLPYFCVQGCKLGQA 
SDFIARQGTLIQVPSSAGBVECYK1CSCGQSGL 
LENCMENfflCMDLFTDTSALVR 


144 


1494 


A 


1814 


1 


404 


PGRRFRPRLSQAGTDSGS* VFPDSFPSAPAEPL 
PmQEPQDAYIVKNKPVELRCRAFPATQIYF 
KCNGEWVSQNDHVTQEOLDEATGLRVREVH 
IE VSRQQ VEELFGLED Y WCQCVA WS SAGTTK 
SRRAYVRI 


145 


1495 


A


1827 


26 


448 


XVEEKHADTWRSXCLSDFFFHAAKXLCXE*N 
CGDAISLSVGDHFGKGNGLTWAEKFQCEGSE 
THLALCPIVQHPEDTCIHSREVGVVCSRYTDV 
RLVNGKSQCDGQVEINVLGHWGSLCDTHWD 
PEDARVLCRQLNCGTAL 


146 


1496 


A 


1828 


574 


333 


QHEGGDLRRRQLGEIQLTVRYVCLRAASAC* 
SMAAET*HHVPASGADPYVRVYLLPERKWA 
CRKKTSVKRKTLEPLFDET 


147


1497 


A 


1855 


1 


372 


ERLVLTSEHCLVLTLFWPSWTYHTLLLSRQH 
VRRJ .PKLTHAEHDHLASIMNKLLTN YDNLFE 
TSVTYSMG*HGAPTGSEAGANWNH**LHAH 
Y YrrLLKMJJl VKJsi'MVGSQMLAQAQRDLTPE 

Q 


148 


1498 


A 


1879 


568 


7 


LLSALDDKGGTQPSASFSNAPTIVCVTACPAG 

IAHTYMAAEYLEKAGRKLGVNVYVEKQGAN 

GIEGR1TADQLNSATACIFAAEVAIKESERFN 

GIPALSVPVAEPIRHAEALMQQALTLKRSDET 

RTVQQDTQPVKSVK-mLKQALLSGISFAVPLI 

VAGGTQVA*AV*RQGISSLHDVQVRTWNS 


149 



1499 


A 


1880 


611 


24 


GLNSENALSNEAMERGWQCLRLFAERLQDlP

PSQIRWATATLRLAVNAGDFIAKAQEELGCP 

VQVISGEEBARLIYQGVAHTTGGADQRLVVD
IGGASTELVTGTGAQTT*LFSLSMGCVTWLER
YFADRNLGQENFDAAQKAAREVLRPVADEL 

nVTJfOYI7VT7\7D/"l A C\fTVA A T nPTN^VyfA OfTN/fTYP 
KYHoWJfuDYKvjAo V 1 YV^ALfV^IJiJVlIviAv^UlYU^J} 

RITMEIWPVD 


150 


1500 


A 


1894 


2 


750 


GRVDFFHTDYRPLIRDSNNYVLDEQTQQAPH 

LMPPPFLVDVDGNPHPTKYQRLVPGRENSAD 

EHLBPQLGYVATSDGEVIEQIISLQTNDNDERS 

PESSILDGMIRQLQQQQDQRMGADQDTIPRG 

LSNGEETPRRGFRRLSLDIQSPPNIGLRRSGQV 

EG VRQMHQNAPRSQIATEKULQ A WKKKV V V 

PEVPLGIFRKLEDFRLEKGEEERNLYIIGRKRK 

TLQLSHKSDSVGLVSQSRPRTCRRKYP 


151 


1501 


A 


1900 


141 


785 


GKTIQIQTIMQNKYKWQKQYKTIPKNKKA 

MEMQIKKQFQDTCKVQTKQYKALKNHQLEV 

TPKNEHKTILKTLKDEQTRKLAILAEQYEQSI 

NEMMASQALRLDEAQEAECQALRLQLQQEM 

ELLNAYQSKIKMQTEAQHEKELQKLEQRVSL 

t»t> lift rrvi/iDCt?l A Af rM/TTTi C?T7T> TVXTT I UDHC 

RRAHLEQK 1 EEELAALQKbKb JbKlj^JLLJSKl^b 
REIETFDMESLRMGFGNLVTLDFPKEDYR 


152 


1502 


A 


1915 


2 


377 


LVRLLDTQRDGLQNYEALLGLTNLSGRSDKL 
RQKIFKERALPDIENYMFENHDQLRQAATEC 
MCNMVLHKBYQERr^ VLL-LAj 
EDDDKVQNAAAGALAMLTAAHKKLCLKMT 
QVTT 


153 


1503 


A 


1921 


1 


237 


AYQSLRLEYLQ1PPVSRAYTTACVLTSAAVQL 
ELITPFQLYFIPELIrKrlr Qi WKL1 1 N r JLrr Virr U 
FNFLLYMIFLYT 


154 


1504 


A 


1928 


2 


354 


EMVEGGEGKMCINTEWGGFGDNGCIDDIRTR 
YDTEVDbGbLNrGK^K YiilUNl 1 oUM YJArcl V 
RQDLIDLTKQGLLFRGQISERLRTRGIFETKFLS 
QIESDRLALLQVRR1LQQLGLD 


155 


1505 


A 


1929 


2 


369 


TEIAKIKMEAKJ'vjvYbKliL 1 Mr l^INDriilvA^l^A 
KSEALVLREKSTLER1HKHQEIETKEIYAQRQ 
LLLKDMDLLRGREAELKQRVEAEESYQLELK 
DDYIIRTYRLIEDDRINIQISGHWQESP 


156 


1506 


A 


1935 


1 


270 


VTRKLPIITVT»AFrARAFRGSPAADCLLENEL 
RSCFGLrWFTTTTDLQILTSSILPSIL 


157 


1507 


A 


1936 


584 


305 


ESKVNl^KFRTKSPFCPAESPQSATKQLDQPTA 
AYEYYDAGNHWCKDCNTICGTMFDFFTHMH 
NKKHTQGQFQKSSDFQKEELQQTFLPPERQG 


158 


1508 


A 


1939 


1 


423 


1 1 HRLN V rAJErrC 1 bMri Y WMrU V rriKC 1 1 A 
NTCPVDLTDYCAQNGFYCLVYGFLPYGSLED 
RLHCQTQACPPLSWPQRLDILLGTARA1QFLH 
QDSPSLIHGDIKSSNVLLDERLTPKLGDFGLA 
RFSRFAGSSPIQSSM 


159 


1509 


A 


1974 


3 


401 


HTSTARIXLHRGAuKiiA V 1 oDvj Y 1 Ai-JrlLAAK 

NGHLATVKLLVEEKADVLARGPLNQTALHL 

AAAHGHSEVVEELVSADVIDLFDEQGLSALH 

LAAOGRHAOTVETLLRHGAHINLOSLKFOGG 

HGPAATLLR 


160 


1510 


A 


1982 


2 


417 


KFLKDLEKQYNKEEPHLSEIGSCFLQNQEGFA 
I YSEYCNNHPGACT PXANI MKQGKYRHFFEA 
CRLLQQMIDIAIDGFLLTPVQKICKYPLQLAEL 
LKYTTQEHGDYSNiKAAY^AMKKVACLINER 
KRKLESIDKIA 


161 


1511 


A 


1984 


4 


770 


RETGSVSLSPSGLEGAESYAVSPILYSSPDVKE 
LWLETLQGQRHSHTGVKSTPGQSAAILMKLR 
SSHNASKTLNANNMETLIECQSEGDIKEHPLL
A^^SEOSICQLJEVKKRKKVLS WPFLMRRLS^

PASDFSGALBTDLKASLFDQPLSIICGDSDTLP 

RPIQDILTIIXJLKGPSTEGIFRRAANEKARKEL 

KEELNSGDAVDLERLPVHLLAVVFKDFLRSIP 

RKLLSSDLFEEWMGALEMQDEEDRIEALK 


162 


1512 


A 


1986 


864 


501 


LLNSGLFSAPDGSNLEMRLTRGGNMCSGR1EI 
KFQGRWGTVCDDNFNIDHASV1CRQLECGSA 
VSFSGSSNFGEGSGPIWFDDUCNGNESALWN 
CKHQGWGKHNCDHAEDAGVICSSKD 


163 


1513 


A 


2001 


419 


187 


AVDLSIDESSLTGETTPCSKVTAPQPAATNGD

LASRSN1AFMGTLVRCGKAKGWIGTGENSE 

FGDHNLSTFWHS 


164 


1514 


A 


2012 


284 


597 


SLLCLFPGTSIVVCKPIVIETQLYVIVAQLFGG 
SHIYKRDSFANKHKIQAIEILKIRKPNDIETFKI 
ENNWYFVVADSSKAGFTTIYKWERETGFYSH 
QSFTR 


165 


1515 


A 


2013 


2 


403 


EDPEELGHFYDYPMALFSTFELFLTIIDGPANY 
NVDU^MYSirYAAFAnAlXLMLNIXIAMMG 
DTHWRVAHERDELWRAQIVATTVMLERKLP 
RCLWPRSGICGREYOLGDRWTI RVPDRnnr w 
RQRIQRYA 


166 


1516 


A 


2019 


2 


927 


CCQREGLGLKAWQULSHGRNGLPGEPASS 

QGLSAASSTPVFHLALQIDSAPDNIDWVEMLF 

NKNMVTERLQNVMVLEQCFSDSSSLYRFLTY 

SYLLAFNVWLLLAPVTLCYDWQVGSIPLVETJ 

WDMRJ^TIFLAVVMALLSLHCLAAFKRLE 

HKEVLVGLLFLVFPFIPA SNLFFR VHFW a ttr 

VLYMPSMGYCILFVHGLSKLCTWLNRCGATT 
LIVSTVLLLLLFSWKTVKQNEIWLSRESLFRS 
GVQTLPHNAKVHYNYANFLKDQGRNKEAIY 
HYRTALNNNKAWDYLCWRFRKTLTDLP 


167 


1517 


A 


2025


696


71 


AAASAASSLTVTLGRLASACSHSILRPSGPGA 

ASLWSASRRFNSQSTSYLPGYVPKTSLSSPPW 

PEVVLPDPVEETRHHAEVVKKVNEMIVTGQY 

GRLFAWHFASRQWKVTSEDLILIGNELDLA 

CGERIRLEKVLLVGADNFTLLGKPLLGKDLV 

RVEATVIEKTESWPRIIMRFRKRKNFKKKRIV 

TTPQTVLRINSIEIAPCLL 


168 


1518 


A 


2046 


2 


366 


HLQVAAkVFMPLQAVDSAt>kPLKG0AQAP<5 " 
RLQGAARVFMPLQAQVKAKASKPLQMQKA 
PPRLRRAARVLMPLQAQVRAPRLLQVQSQVS 
KKQQAQTQTSEPQDLDQVPEEFQGQDQVLR 


169 


1519 


A 


2049 


1 


945 


QNLEDREVLNGVQTELLTSPRTKDTLSDMTR 

TVEISGEGGPLGIHWPFFSSLSGR1LGLFIRGI 

EDNSRSKREGLFHENECIVKINNVDLVDKTFA 

QAQDVFRQAMKSPSVLLHVLPPQNREQYEKS 

VIGSLNIFGNNDGVLKTKVPPPVHGKSGLKTA 

NLTGTDSPETDASASLQQNKSPRVPRLGGKPS 

SPSLSPLMGFGSNKNAKKJKIDLKKGPEGLGF 

TWTRDSSIHGPGPIFVKNILPKGAAIKDGRLQ 

SGDRILEVMGRDVTGRTQEELVAMLRSTKQG 

ETASLVIARQEGHFLPRELVMFRSQSH 


170 


1520 


A 


2050 


363 


1 


PVATHLTKILNSDEHAVVISSAKTLCETVKDF
VAKVEKTYDKTLENAVVADAVASKCSVLNE 
KLEQLLQALHTDSQAAPVLPGLSPLIVEEDAV 
ESSSEESLGESKEQLGDDVTKPSSOKA 


171 


1521 


A 


2055 


139 


675 


IPSRPWLGRITGLDPAGPLFNGKPHQDRLDPS 
DAQFVDV1HSDTDALGYKEPLGNIDFYPNGG 
LDQPGCPKTILGGFQ YFKCDHQRS VYL YLS SL
REgcYfrAVPCDSVObVftNGliCvSCGTSQkE 
SCPLLGYYADNWKDHLRGKDPPMTKAFFDT 

AjDiiOi rV//Vl ttlir VUlli VvfNrfcJNVrv 


172 


1522 


A 


2056 


3 


361


UQHKSAVEYAQSHLSLVSMCKESHKCSEPK 
MEWKVKmSDGTRYITKRPVRBRILKERALKI 
. KEERSGLTTDDDTMSEMKMGRY W SKEERKQ 


173 


1523 


A 


2060 


1 


387 


GTRILSMQIPFYGFQPIRTSEHMAAAGVFALL 
QAYAFLQYLRDRLTKQEFQTLFFLGVSLAAG 
AVFLSVIYLTYTGYIAPWSGRFYSLWDTGYA 
KUili'UAbVatSrli^rl 1 WVofrrlJLiiUAjClrrA 

G 


174 


1524 


A 


2071 


74 


443 


LLMGPKAKKSGSKKXICVTKAERLKIXQEEEE 
RRLKJSEEEARLKYEKEEMERLEIQRIEKEKW 
HRLEAKDLERRNEELEELYLLERCFPEAEKLK 
QETKLLSQ WKHYIQCDG SPDPS VAQEMNT 


175 


1525 


A 


2083 


139 


486 


AALTWSQPQEFWPMEMQPIVTDMVTVHWV 
AESSTVGWLCALFRVTHYGVGATGHGYVCG 
KRVLCGLPLPSPAPMPLMSLPEGESRKEREVQ 
RLQFPYLEPGHELPATTLLAFLAAV 


176 


1526 


A 


2092 


3 


587 


EGS\'NFKFGVLFAKDGQLTDDEMFSNEIGSEP 

FQKFLNLLGDTITLKGWTGYRGGLDTKNDTT 

GiHSVYJ V Y ^(jHIiIMJ 4 H Vol MLr Y bKfaMKl^y 

VERKRHIGNDIVTIVFQEGEESSPAFKPSM1RS 

HFTHIFALVRYNQONDNYRLKIFSEESVPLFG 

PPLPTPPVFTOHQEFRDFIXVKLINGEKATLET 

PCI 


177 


1527


A


2103 


44 


427 


CDGAWLAWACWVFGNDFPSPASAACSALLG 
CSVSTACLC^PLCSGSPLAPFRRTAALQEGLR 
RAVSVPLTLAETVASLWPALQELARCGNLAC 


178 


1528 


A 


2104 


2 


409 


ALQSTLGAVWLGLLLNSLWKVAESKDQVFQ 
PSTAASSEGAVVEIFCNHSVSNAYNFFWYLHF 
PGCAPRIXVKGSKPSQQGRYNMTYERFSSSL 

I TT fWTDI? A n A A WVP A VPVPKfTHlf I IPHTOT 
J_»lLV<VKilAL/AAV I I W\VX!iYrT\ I UJSX-lrU 1 U 1 

RLQVFPNIQNPD 


179 


1529 


A 


2111 


1 


312 


PTRSSTRPPSLFVHASAKGGEKEEGDDGHYL 
MRTESHTGLKKGGNAl^WMLKRNTEPKKG 
SYHFDLERLRAAHELFEREQEHLAPGGISMPL 
PPPLPUPACLG 


180 


1530 


A 


2116 


3 


366 


TSIKRAjETTDVTRSFGXVDSSEAWQQHDVQE 
LCRYMFD ALEQKWKQTEQ ADL1NEL Y QGKL 
KDYVRSLECGYEGWRIDTYLDIPLVIRPYGSS 
QAFASWCTFHLTACVSLHRIHNSTW 


181 


1531 


A 


2117 


2 


386


YGLGAHFGR3LFIQAGINENDFYDGAWCAGR 

xmi hnurrc\/n a t>tj f TTDT7m\/TXon.DXTCT \xn c 
NULQQWIbVlJAKK-LlKr Id VI 1 I^LtKinoI..VVJLo 

DWVTS YKVMVSND SHTW VTGKNGSGDMIFE 
GNSEKEIPVLNELPVPMVARY1RINPQSWFDN 
GSICI 


182 


1532 


A 


2123 


1 


493 


RTKTDVYILNLAVADLLLLFTLPFWAVNAVH 
GWVLGKIMCKrrSALYTLNFySGMQFLAaSI 

dryvavtkvpsqsgvgkpcwucfcvwmaai 
llsipqlvfvtvndnarcipifprylgtsmkal 
iqmleicigfvvpflimgvcyfitartlmkmp 
nikis 


183 


1533 


A 


2140 


3 


561 


RQAWHEAFKVRKEILTVICCLLAFCIGLIFVQ 
RSGNYF\nTMFDDYSATLPLLIVVILENIAVCF 
WGIDKiMEDLKX»MUJFAPSRYYYYMWKYI
SPLMLL$r j r iA ^WMMaj^QPPnvMA wrcnv a oH

EEFLSYPTWGLAVCASLDVFAILPVPVAFIGR 

RFSUDDGAGPFCSAAYTTTGCRTPYL 


184 


1534 


A 


2145 


3 


538 


HELTVAAADRGQPPQSSVYPVTVTVLDVNI) 

NPPVFITIASYRVTVPEDTPVGAELLHVEASD 

ADPGPHGLVRFTVSSGDPSGLFELDESSGTLR 

LAHALDCETQAKHQLVVQAADPAGAHFALA 

PVTIEVQDVNDHGPAFPLNLLSTSVAENQPPG 

TLVTTLHAIBGDAGAFGRLRYHL 


185 


1535 


A 


2151 


2 


671 


LDKXLDRMENYMFNEYILKQVAATYIKLGW 

PKNNFNGSLVQASYQHEELRREVIMLACSFG 

NKHCHQQASTUSDWISSNRNRIPLNVRDIVY 

CTGYSLLDEDVWEFIWMKFHSTTAVSEKKIL 

LEALTCSDDRNLLNRLLNLSLNSEWLDQDAI 

D VIIHVARNPHGRDL A WKFFRDK WKILNTRI 

RQKTLEFDFAEPLILAFPnLYTAIDNPPLVREH 

E 


186


1536 


A 


2153 


2 


400 


GPMCDKHSAFAEKFHAGF1D YIVHPL WETWA 
HLALPDAQDILYTLEDNRNWVDSMIPQSPSPP 
LDEQNRDWQGLLENLHVELTLDEEDSEGPEK 
bOEGQTYFrSSKTLCGIVPQNTDSLGETGlHIC 
AHDKSP 


187 


1537 


A 


2158 


227 


442 


FNCFRVASDSFLENSSLLIMILPLRNATQEFBR 
PGAVAYTCNPSTLGGWGGWITRSGVRDQPG 
QI1GGTPS 


188 


1538 


A 


2167 


a 
j 


aha 


AHLGGAWLTQRSLGSWAAPGPARAAKEWA 

CIPQNQKMNIWRMKTSKHLQLLSFVLGAVSP 

AVVVPYMMVLQENGYGVEEGIPTLLMAASS 

MD0ILA1TGFNTCLS1YFSSGCARSSGSRNSKS 

LK I rLO i ICJiuCDDSSIFSHLDHSSK W S STY G 

HSGA 


189 


1539 


A 


2168 


2 


412 


EFLSSNQ1TQLPKTTFRPMPNLRSVDLSYNKL 
Q.ALAPDLFHGLRKLTTLHMRANAIQFVPVRIF 
QDCRSLKFLDIGYNQLKSLARNSFAGLFKLTE 
LHLEHNDLVfCVNFAHFPRLISLHSLCLRRNKV 
AJ WSS I/D W 


190 


1540 


A 


2179 


64 


399 


MRLNQOTLLLESFGXXRPYTSEHAPTYHQW
MAAUrVLLKW 1 ioErLrLEHEYAMQRTWLED 
AYECTFIVLDAEKRHAQPGATEESCMVGDVN 
LFLTDLEDLTLGEIEVLIAEP 


191 


1541 


A 


2190 


1 


469 


CLDRAAGIRHERNVryiNETHTRHRGWLARR
LSYVLFIQERDVHKGMFATNVTENVLNSSRV 

AKRILQEMVATVSPAMIRLTGWVLLKLFNSF 
FWNIQIHKGQLEMVKAATETNLPLLFLPVHR 
SH 


192 


1542 


A 


2197 


26 


117 


PCVYnnrDT T T TATA! Ventre «u r>T /-iy-st rtn 

rc&AUUlKLLLl U l^LYOKruoAlAPLGDLDR 
DGYNGEGREEPY 


193 


1543 


A 


2236 


2 


383 


EYFPNSIWSU'STMDLGDIGFYTYRILQALS 
YTHSKGIMHRDVKPLNILCNSPRNKVILADW 
GLAEFYHPMRKYSVHVATRYYKSPEILLDYE 
YYDYSLDIWAVGVBLLELLTLKLHVFEGGDN 
EQ 


194 


1544 


A 


2241 


105 


409 


RKGVGKMPTSEGRPGQERSDWVTSYKVMGS 
NDSHTWVTVKNGSGDMIFEGNSEKEIPVLNE 
LPVPMGAR YIRINPQ S WFDNGSICMRMEELG C 
PLPDPNNY 


195 


1545 


A 


2245 


1 


672 


MGVASDWTKRIEYQPGSGSMPLFPSIHLETCD 
GAVSSLQIVmQTNYIGKGCDRETYSEKSLQ
jkLCQASSGHDLLW^SAAlNWTAGLLVDSSE

MIFKJDGRQGAK1PDGIVPKNLTDQFTTTMW 

MKHGPSPGVRAEKETIIXIYSDKTEMNRHHY 

ALYVHNCRLVFLLRKDFDQADTFRPAEFHW 

KJLDQQALAKVDGQPGKSITRQLQEMPVTIQG 

ISLKPS 


196 


1546 


A 


2256 


1 


396 


FRGTPVSGLTNIU>TLAVnUIFREPIRLKTVKP 
GKVINKDLRHYLSLQFQKGSIDHKLQQVIRD 
NLYLRTIPCTTRAPRDGEVPGVDYNFISVEQF 
KALEESGALLESGTYDGNFYGTPKPPAEPSPF 
QPDPV


197


1547


A 


2259 


43 


5<W 


QLAIEIGVRALLFGVFVFTEFLDPFQRVIQPEEI 

Wl YKNP1 /lOSDNIPTRLMFAISFLTPLAVICV 

VKIIRRTDKTEIKEAFLAVSLALALNGVCTNTI 

KLIVGRPRPDFFreCFPIXJVMNSEMHCTGDP 

DLVSEGRKSFPSIHSSFAFSGLGFTTFYLAGKL 

HCFTESGRGKSWRLCAAILPL 


198 


1548 


A 


2275 


3 


404 


TCTTVVVIPRMLVDFLSESKTISLPECATQMFF 

FLGFASNNCFIMAAMSYDRYTAIHNPLQYHT 

LMTOKICLQMMMASWMVGFLFSLCIIVrVFN 

LSIX^DLNTIQHYFCDISPVVSLACNYTFYHEM 

AIFVLSA 


199 


1549 


A 


2315 


1 


375 


LTQMFFIHALSAIESTILLAMAFDRYVAICHPL 
RHAAVLNNTVTAQIGIVAVVRGSLFFFPLPLLI 
KRLAFCHSNVLSHSYCVHQDVMKLAYADTL 
PNWYGLTAILLVMGXDRMFISLSYFLII 


200 


1550 


A 


2334 


2 


409 


FRVRPQQRKMSFFFKTELGEKLVTKFLFETDF 
SDDPMLPSPDQLKKKAPFTNKKLKAHQTPVD 
ILKQKAHQLASMQVQAYNGGNANPRPANNE 
EEEDEEDEYDYDYESLSDDNILEDRPENKSCH 
DQLQFEYKEEM 


201 


1551 


A 


2350 


3 


512 


ISWEAQIAE1IQWVSDEKDARGYLQALASKM 
TEELEALRSSSLGSRTLDPLWKVRRSQKLDM 
SARLELQSALEAEIRAKQLVQEELRKVKDAN 
LTLESKLKDSEAKNRELLEEMEILKKKMEEK 
FRADTGKLMLCDSALFEYKYFSNECFYFLFD 
LIVTLEAPTEFQIQY 


202 


1552 


A 


2351 


1 


1003 


PSSYSSDELSPGEPLTSFPWAPLGAPERPEHLL 

NRVLERLAGGATRDSAASDILLDDIVLTHSLF 

LPTEKFLQELHQYFVRAGGMEGPEGLGRKQA . 

CLAMLLHFLDTYQGLLQEEEGAGHUKDLYL 

LIMKDESLYQGLREDTLRLHQLVETVELKIPE 

ENQPPSKQVKPLFRHFRR1DSCLQTRVAFRGS 

DEIFCRVYMPDHSYVITRSRLSASVQDILGSV 

TEKLQYSEEPAGREDSLILVAVSSSGEKVLLQ 

PTEDCVFTALGINSHLF ACTRD S YE AL VPLPE 

EIQVSPGDTEIHRVEPEDVANHLTAFHWELFR 

CVHELEFVDYVFHGE 


203 


1553 


A 


2361 


2 


403 


NNLNCAEPLFEQNNSLNVNFNTQKKTVWLIH 

GYRPVGSIPLWLQNFVRILLNEEDMNVIVVD 

WSRGATTFIYNRAVKNTRKVAVSLSVHIKNL 

LKHGASLDNFHFIGGSLGAHISGFVGK1FHGQ 

LGRITGLDP 


204 


1554 


A 


2390 


280 


476 


SPSLLPQCLMSLSDLSLSPAPPSHLSPRCPSPQ 

AGSRLGAMRRCAREMDATPMPPAPSCPSERV 

T 


205 


1555 


A 


2400 


543 


745 


AAVALRDISWQQPYPMDFYAGSSLGPWTVN 

HGQDRRPHAPGRPARGKVQEGSARPPSAVAC 

EDCSCR


206


1556 


A 


2406 


122 


485 


DLSPDSREDHPQGHRRLLPKRPVRGSLMPQH 
THHPCPVSSTTNDTTDQIWVSVGSLRMGTGG 
MGANASTSPRCWDLSSGKKKWIIQVPILASIV 
ESRGGLLATGVGGMCACVPRNQPLTGT 


207 


1557 


A 


2409 


289 


418 


LWTLYRHKQQVQHNHSNRLSCRPSQEDRAT
HTIMVLDKENTLS


208


1558


A 


2413 


64 


492 


VQGTGXXFIAr»"l'EAMTHFPASPVWAGMFFL 

MLINLGLGSMIGTMAGITTPIIDTFKVPKEMFT 

GGCCVFAFLVGLLFVQRSGNYFVTMFDDYSA 

TLPLTLrVILENIAVAWrV'GTKKFMQELTEML 

GFRPYRFYFYMWKFVSP 


209 


1559 


A 


2417 


3 


877 


BKERLLDE WFTLDEVPKGKLHLRLE WLTLMP
NASNLDKVLTDDCADKDQANDGLSSALLILY 
LDSARNLPIRYKTNEPVTOENFTFFIHNPKRQ 
DLEVEYRDEQHQCPLGNLKVPLSQLLTSQDM 
TVSORFOLGNSGPNSTIKMKTAT RV1 Hi fkrf 

RPPDHQHSAQVKRPSVSKEGRKTSIKSHMSG 
SPGPGGSNTAPSTPVIGGSDKPGMEEKAQPPE 
AGPQGLHDLGRSSSSLLASPGHISYKEPTPSIA 
SDISLPIATQELRQRLRQLENGTTLGQSPLGQI 
QLT1P 


210 


1560 


A 


2422 


35 


456 


REFAASDLEPFIPTDQPISPEA1TQPSCIKRQRA 
AGNPGSLAAHDHKPCSAPLEPKIQASRNQRW 
GAVRAAESLTDfAEPASPQVHETPIDASQTQK 
VEPASKSRFTPELOAK VSH^R FTlAT enon a tp 
HHAQPQRGEG 


211 


1561 


A 


2431 


1 


764 


RRYSQKLIQHTACQLLRTYPAATRIDSSNPNP 

LMFWLHGJQLVALNYQTDDLPLHLNAAMFE 

ANGGCGYVLKPPVLWDBCNCPMYQKFSPLER 

DIJ)SMDPAVYSLTIVSGONVCPq , N < 3Mn«m n, TP 

VDVLGMPLDSCI IFRTKPII IRNTLNPMWNEQF 

LFHVHFBDLVFLRFAVVENNSSAVTAQRIJPL 

KALKRGYRHLQLRNLHNEVLEISSLFINSRRM 

EENSSGNTMSASSMFNTEERKCLQTHRVTVH 

GVPG 


212 


1562 


A 


2436 


1 


411 


G1RGTTGHLGCP1NDDPSLTLTVSWVMEDKPI 
YIGNGTKKEDDSLTIFAVAKRDHVSDTCOAC 
TDIJ3HmDKGYLTVLGEQATFrNRLGALPKG 
RANRTRDLELTYLAERTVRLTWIPGDANNRPI 
TDYDCQJEEHQ 


213 


1563 


A 


2445 


1 


1294 


MSSIGCLWVSRSSQ1DGLTAEKSGPEKPHGT 

WLMPELHPKEQILELLVLEQFLSILPEELQIWV 

QQH^ESGEESVTLLEDLEREFDDPGQQVPAS 

PQGPAVPWKDLTCLRASQESTDIHLQPLKTQ 

LKSWKPCLSPKSDCENSETATKEGISEEKSQG 

LPQEPSFRGISEHESNLVWKQGSATGEKLRSP 

SQGGSFSQVIFTNKSLGKRDLYDEAERCLILT 

TDSIMOQKVPPEERPYRCDVCGHSFKQHSSLT 

QHQRIHTGEKPYKCNQCGKAFSLRSYLIIHQR 

IHSGEKAYECSECGKAFNQSSALIRHRKIHTG 

EKACKCNECGKAFSQSSYLHHQRIHTGEKPY 

ECNECGKTFSQSSKLIRHQRIHTGERPYECNE 

CGKAFRQSSELITHQRIHSGEKPYECSECGKA 

FSLSSNLIRHQRIHSG 


214 


1564 


A 


2461 




615 


GIPGSTISSSRNIFLEDDLAWQSLIHPDSSNTPL 
STRLVSVQEDAGKSPARNRSASITNLSLDRSG 
SPMVPSYETSVSPQANRTYVRTETTEDERKIL 
LDSVQLKDLWKKICHHSSGMEFQDHRYWLR 
THPNCIYGKELVNWLIRNGHIATRAQA1AIGQ 
AMVDGRWtbCVSHHDQLFRDEYALYRPLQV
LFSVYCQLECSKL1L 


215 


1565 


A 


2464 


3 


2932 


GPGVRSSQDGMADVFVHLRTAWPRCSFISGQ 

HGPGKHGRRYCSSQDSMADVFVHLRTAWPT 

CSUSGQHGPGESVSYEDDDIPAPASLLHVNA 

AAPALTNI^APVLCTAPNNTAQKEKVPSGMR 

QRPAGVRISSRTPDLTCAVSTHSTVPGVRISSC 

TPDLTCAVSIHSTVPSVCISSCTPDLTCAVSTH 

STVPGVRISSCTPDLTCAVSTHSTVPGVRISSR 

TPDLTCAVSIHATVPGVRISSCTPDLTCAVSIH 

ATVPGVRISSCTPDLTCAVSTHSTVPGVRISSR 

TPDLTCAVSfflSTVPGVRISSCTPDLTCAVSIH 

ATVPGVRISSCTPDLTCAVSTHSTVPGVRISSR 

TPDLTCAVSIHATVPGVRISSRTPDLTCAVSIH 

ATVPGVRISSCTPDLTCAVSIHATVPGVR1SSC 

TPDLTCAVSIHATVPGVRISSRTPDLTCAVSIH 

ATWGVRISSCTPDLTCAVSTHSTVPGVRISSR 

TPDLTCAVSIHATVPGVRISSCTPDLTCAVSTH 

STVPGVRISSRTPDLTCAVSIHATVPGVHISSC 

TPDLTCAVSTHSTVPGVRISSRTPDLTCAVSIH 

STVPGVCISSRTPDLTCAVSIHSTVPSVHISSCT 

PDLTCAVSIHSTVPGVRISSRTPDLTCAVSTHS 

TWGVHISSCTTDLTCAVSIHATVPGVHISSCT 

PDLTCAVSTHTTVPGVRISSRTPDLTCAVSIHS 

TVPGVRISSCTPDLTCAVSTHSTVPGVRISSRT 

PDLTCAVSTHLTVPGVRISSRTPDLTCAVSIHA 

TVPGVHISSCTPDLTCAVSIHATVPGVR1SSRT 

PDLTCAVSIHATVPGVHISSCTPDLTCAVSTHS 

TVPGVRISSRTPDLTTAVSIHSTVPGVHISSCT 

PDLTCAVSTHSTVPGVHISSCTPDLTCAVSTH 

STVPGVHISSRTPDLTCAVSIHATVPSVHISSC 

TPDLTCAVSIHSTVPGLLTSVSQTSTG


216


1566


A


2477 


1 


414 


FRTKSYRKGSYRCIVSEWIAEQGNWQE1QEK 

AVEVATVVIQPTVLRAAVPKNVSVAEGKELD 

LTCNTTTDRADDVRPEVTWSFSRMPDSTLPGS 

RVLARLDRDFLVHSSPHVALSHVDARSYHLL 

VRDVSKENSGYYY 


217 


1567 


A 


2480 


2 


460 


CRTLCEGPQRFEEYEYLGYKAGLYEAIADHY 
MQVLVCQHECVRELATRPGRLSPIENFLPLHY 
DYLQFAYYRVGEYVKALECAKAYLLCHPDD 
EDVLDNVDYYESLLDDSIDPASIEAREDLTMF 
VKRHKLESELIKSAAEGLGXSYTEPNYW 


218 


1568 


A 


2483 


140 


383 


AFSSPHPSPAPQFPECGFYGLYDKILLFKHDPT 
SANLLQLVRSSGDIQEGDLVEYVLSASATFED 
LQIRPHALTVHSYRAP 


219 


1569


A 


2489


3 


428 


ssrlvUlagaAalasgsqgdrepvyrdcvxq 
ceeqncsggalnhfrsrqpiymslagwtcrd 
dckyecmwvtvglylqeghkvpqfhgkwp 
fsrflffqepasavasflnglaslvmlcryrt 
fvpasspmyhtcvafawvs 


220 


1570 


A 


2498 


1 


1297 


MDGEAVRFCTDNQCVSLHPQEVDSVAMAPA 

APKIPRLVQATPAFMAVTLVFSLVTLFVVDH 

HHFGREAEMRELIQTFKGHMENSSAWVVEIO 

MIJCCRVDNVNSQLQVLGDHLGNTNADIQMV 

KGVLKDATTLSLQTXJMLRSSLEGTNAEIQRL 

KEDLEKADALTFQTLNFLKSSLENTSIELHVL 

SRGLENANSEIQMLNASLETANTQ AQLAN SS 

LKNANAEIYVUIGH1JDSWDLRTQNQVLRNS 

LEGANAEIQGLKENLQNTKALNSQTQAFIKSS 
UTYMTC A DTOUT D/VLTt CD A/"»TM5n±m vnr\r is\ r —

vtaqtqkangi£dqto 
tlnaqiqvlnghmknasreiqtlkqomkna 
saltsqtqmldsnlqkasaeiqrlrgdlent 
kaltn^qqeqsrlktlhvvitsqeqlqrtq 


221 


1571 


A 




j 




K. VKLNN DOLbPLMMAAKTGKIGIFQHnRKE V 
TDEDTRHLSRKFKDWAYGPVYSSLYDLSSLD 
TCOEEASVLEILVYNSKIENRHEMLAVEPINE 
LLRDKWRKFGAVSFYINWSYLCAMVIFTLT 
AYYQPLEGTPPYPYRTTVDYLRLAGEVITLFT 
KjyLrrr l N 


222 


1572 


A 


2508 


3 


395 


dahcqrklamqefmeinerltelhtqkqkl 
arhvrdkeeevdlvmqkveslrqelrrter 
akkelevhtealaaeaskdrklreqsehysk 
qleneleglkqkqisyspgvcsiehqqehkl 

V1T\I 


223 


1573 


A 


2544 


2 


412 


NDPAIISNFSAAWHTIVNElLliSMTSLEVTK 
MVDERTD YLTKSLKEKTPPF SHCDQ AVLQ C S 
EASSNKDMFADRLSKSIDCHSIDKSKSVIPNE) 
KNAVYKESLPVSGEESQLTPEKSPKFPDSQNQ 
LTHCSLSAA 


224 


1574 


A 


2552 


401 


1 


GASLCFISTAFTVLTFLIDSCRFSYPERPnFLSM 
CYNIYS1AYIVRLTVGRERISCDFEEAAEPVLI 
QEGLKNTGCAHFLLMYFFGMASSIWWVILTL 
TWFLAAGLKWGHEAIEMHSSYFHIAAWAIPA 
VK


225 


1575 


A 


2563 


724 


3 


MSARKERREKGEEEGEGEKDGDEDEKEEEKE 

GLGEEEEKEAGKKKKKQEEKEKEKGAVYSR 

VARICKNDMGGSQRVLEKHWTSFLKARLNC 

SVPGDSFFYFDVLQSITDIIQINGIPTVVGVFTT 

QLNSIPGSAVCAFSMDDIEKVFKGRFKEQKTP 

DSVWTAVPEDKVPKPRPGCCAKHGLAEAYK 

TSIDFPDETLSFIKSHPLMDSAVPPIADEPWFT 

KTRVRYRLTAISVDHSAGPYH 


226 


1576 


A 


2571 


449 


3 


EGVLFVYONYVGDVMNFEMAAEMAQEVAIP 

TRTVLTTDDISSSPiEDRDGRRGVAGNFFIFKV 

AGAACDRGMSLEACEAVTRKANRRTYTMG 

VALEPCSLPQTRRHNFEIGAEEMEIGMG1HGE 

RGVIREKMMPADAIVDHIMDR1FS 


227 


1577 


A 


2575 


3 


1197 


VLSDLCLFYYRDEKEEGILGSILLPSFQtALLTS 

EDHINRKYAFKAAHPNMRTYYFCTDTGKEM 

fcL W MKAMLDAAL VQTEPVKRVDKlTSENAP 

TKETNNIPNHRVLIKPEIQNNQKNKEMSKIEE 

KKALEAEKYGFQKDGQDRPLTBONSVKLNSL 

PSEYESGSACPAQTVHYRPINUSSENKTVNVS 

LAULrUuMRtM lOt'LY lEADKYIQRTNSMQQ 

LEQWIKIQKGRGHEEETRGVISYQTLPRNMPS 

HRAQIMARYPEGYRTLPRNSKTRPESICSVTP 

STHDKTLGPGAEEKRRSMRDDTMWQLYEW 

QQRQFYNKQSTLPRHSTLSSPKTMVNISDQT 

MHSIPTSPSHGSIAAYQGYSPQRTYRSEVSSPI 

QRGDVTIDRRHRAHHPKVK 


228 


1578 


A 


2583 


3 


330 


LPFLGLGSVLPQGMVMASPEMNPTICSVFEA 
HIVLLFHATTFRRGFQVTVLVGNVRQTAWE 
KIHAKVRGTWPFISPEVRKEGGLPQTGREIXD 
PTMGIKPHLWWVAA 


229 


1579 


A 


2589 


1


448 


DDKNAQGIKRHVKPTSGNAFTICKYPCGKSR 
ECVAPNICKCKPGYIGSNCQTALCDPDCKNH 
GKCIBCPNICQCLPGHGGATCDEEHCNPPCQH 
GGTCLAGNLCTCPYGFVGPRCETMVCNRHC
ENGGQCLTPDICQCKPGWYGPTCSTA 


230 


1580 


A 


2593 


2 


138 


AVTFSWFAYVADITQEHERSMAYGLVCMFI 
LYLLYLLRNAFFLR 


231 


1581 




2595 


185 


2 


SGPYTDFTPWPTEEQKLLEQALKTYPVNPPER 
WEKIAEAVPGRTKKAC1KRYKVADJLRISK 


232 


1582 


A


2596 


1


391 


STVTGOFRRLLDTAGHOOPFLELKIRANEPGA 
GRARRRTPTCEPATPLCCRRDHYVNFQELGW 
I0)WI1JLPEGYQLWCSGQCPTHLAGSPGIAAS 
FHSAVFSLLKANNPWPGRTSWCVPTARRPLS 
LLYL 


233 


1583 


A 


2601 


184 


404 


LLFSDEIIMAAPLRIADVTSGLIGGEDGRVYV 

YNGKETTLGDMTOKCKSWITPCPEEKVNVLQ 

NSIPYWERIT 


234 


1584


A 


2614 


178 


335 


PLTL CLPENNKPPQADA VPDKELTLP VDSTTL 
DGSKSSDDQKIISYLWEKTQ 


235


1585


A 

A 


7£1£ 
zolo 


i 




n\/l P WriTnV AQTT?T4PK4riTT TWHV FT FF>I V 
U VLJ2i V xUlvJV /T.O J JCvJT-CiVlLJ 1 i^J^rutliVr<l /ijJL/JL/ Y 

AKFLNVEAAMWGMGFATNSMNIPALVGKG 

CLILRDEVNHTSLVLGARLLGATIGIFKHNYA 

QSLEKLU03AVIYGQPRTRRAWKKXLILVEGV 

YSMEGSIVHLPQIIALKKKYKAYLYIDEAHSI 

ftAVGPTORfrVTFFFGl nPHFVDVT MGTFTKS 

FGASGGYIAGRKARILSPPACLVPNTGSHSLH 

RLTRDLQMNEAMVALVTDRLQGWNSGEGN 

WDRADKFGDLVDYLRVHSHSAVYASSMSPPI 

AEQ1IRSLKL1 MGLDGTTQ 


236


1586


A 


7^71 


i 


^07 


"NT^^FPADP^PARP^T PHT SOHPSMPT T PT AS 

ADHPQCGRFLPLHEPEPLCPSPSLS YPTL VS S 
WSSPFSSHHGCPPGLYPFPTSPKTIQPPGLAQL 
KMLCIPPGRQQLRGAQSMPGHGALSPLLLPP 
A 


237 


1587 


A 


2628 


398 


1 


DLVCKISGFGRGPRDRSEAVYTTMSGRSPAL 

WAAPETLQFGHFSSASDVWSFGIIMWEVMAF 

GERPYWDMSGQDVIKAVEDGFRLPPPRNCPN 

LMHRLMLDCWQKDPGERPRFSQIHSILSKMV 

QDPEPPNV 


238 


1588 


A 


2631 


1 


1104 


WSPCSLTCGVGLQTRDVFCSHLLSREMNETV 

ILADELCRQPKPSTVQACNRFNCPPAWYPAQ 

WOPCSRTCGGGVO!GIEVIjCKORMAJ3GSFLE 

LPETFCSASKPACQQACKKDDCPSEWLLSDW 

TECSTSCGEGTQTRSAICRKMLKTGLSTWNS 

TLCPPLPFSSSIRPCMLATCARPGRPSTKHSPH1 

AAARK VYIQTRRQRKLHF VGG GFA YLLPKT A 

WLRCPARRVRKPLITWEKDGQHLISSTHVT 

VAPFGYLKIHRLKPSDAGVYTCSAGPAREHF 

VTKLIGGNRKLVARPLSPRSEEEVLAGRKGGP 

KEALOTHKHONGIFSNGSKAEKRGLAANPGS 

RYDDLVSRLLEQGAPCSSSKKKN 


239 


1589 


A 


2636 


1 


678 


MKPDNIIXDEHGHVHITDFNIAaMLPRETQIT 
TMAGTKPYMAPEMFS SRKG AG YSFAVD W W 
SLG VTAYELLRGRRP YHIRS STSSKEIVHTFET 
TVVTYPSAWSQEMVSLLKKLLEPNPDQRFSQ 
LSDVQNFPYMNDnWDAVTQKRLIPGFIPNK 
GRLNCDPTFELFJEMILESKPLHKKXKRLAKK 
EKDMRKCDSSQTCLLQEHLDSVQKEFmNRE 
KVNRDCI 


240 


1590 


A 


2639 


389 


3 


ELLDPTTPMRTKCIELLYAALTSSSTDQPKAD 
LWQNFAREIEEHVFILYSKNIKKYKTCIRSKV 
ANI>KNPRNSHLQQNIJ.SGTTSPRFFAJEMTVM 
v * ~~


241 


1591 


A 


2640 


392 


3 


IRLTELRCVFMRLATICVLVFaGSKITSCDDD 
TCDLCGYNQKLYPCWETQVGQEMYKLMIFD 
FIULAVUFVDFPRKLLVTYCSSCKLIQCWGQ 
QEFAIPDNVLGIVYGQTICWIGAFFSPLLPAM 


242 


1592 


A 


2642 


405 


1 


YFKNmLLVGVICVAAAVEKWNLHKRJALR 
MVLMAGAKmMLLLCFMOTTLLSMWLSNT 
STTAMVMPIVEAVLQELVSAEDEQLVAGNSN 
TEEAEPISLDVKNSQPSVELIFVNEDILDFLMK 
SPLM1SQACI 


243 


1593 


A 


2646 


412 


2 


CLAM1KGIQSSGKIIYFSSLFPYVVLICFLIRAF 
LLNGSIDGIRHMFTPKLEIMLEPKVWREAATQ 
VFFALGLGFGGVIAFSSYNKRDNNCHFDAVL 
VSFINFFTSVLATLVVFAVLGFKANVINEKCIT 
QNSETV 


244 


1594 


A 


2650 


1 


1271 


MTTTLIGLLKTARLLRLVRVARKLDRYSEYG 

AAVLMLLMCIFALIAHWLACIWYA1GNVERP 

YLTDKIGWLDSLGQQIGKRYNDSDSSSGPSIK 

DKYVTALYFTFSSLTSVGFGNVSPNTNSEKIF 

SICVMUGSLMYASIFGNVSAIIQRLYSGTARY 

HMQMLRVKEFIRFHQIPNPLRQRLEEYFQHA 

WTYTNGIDMNMVTNGTCSSCTSDDGHFILVS 

NHHQGGLIYSWNDAASMQRPFNHIKSSLLGS 

TSDSNLNKYSTINKIPQLTLNFSEVKTEKKNSS 

PPSSDKTIIAPBCVKDRTHNVTEKVTQVLSLGA 

DVLPEYKLQAPRINKFTILHYSPFKAVWDWLI 

LLLVIYTAIFTPYSAAFLLNDREEQKRRECGY 

SCSPLNVVDLIVDIK4FIIDILINFRTTYVNQNEE 

WSDPASV 


245 


1595 


A 


2656 


385 


2 


NLTWWPLFRDVSFY1VDL1MLIIFFLDNVIMW 

WESLLLLTAYFCYVWMKFNVQVEKWVKQ 

MINRNKWKVTAPEAQAKPSAARDKDEPTLP 

AKPRLQRGGSSASLHNSLMRNSIFQNK1HTLD 

PHV 


246


1596 


A 


2660 


200 


506 


VLVLQMNYYQMLIIYYVLFFKVNEFLAFEGPI 
U^UMKJJUiLlJuwQLaQATA^ 

IKGSFKQTYLVCLCTSSPNGKLDBEVSMFSFIS 
NYFLS 


247 


1597 


A 


2678 


3 


267 


DAWVKbTOIIFNQTERKQKlSENLKHLASVRV 

VQKNLVFVVGLSQRLADPEVSPLVFFVILIFF 

v^r <iyt crnrriDA at otxqqvuvio 
volo i bi^irurrtyLUJJooJirllio 


248 


1598 


A 


2687 


1 


404 


DFTTLAAMMRT1JFSLFGDVRSDVHRFSVTLF 
GAAIKSVKNPDKKSIENQVLDSLVPLLLYSQD 
ENDAVAEESRQVLTICAQFLKWKLPREVYSK 
DPWHIKPTEAGTICRFFEKKCKGKINILEQTL 
M I oJsJNrKL 


249 


1599 


A 


2692 


1 


440


FRRRRRRRERDCAAQGARRHCRHLAECIOAr" 

SFPIGrYKYLRNVSGQIHLlTLANNELKSLTSK 

FMTTFSQLRELHLEGNFLHRLPSEVSALQHLK 

AEDLSRNQFQDFPEQLTALPALETINLEENEIV 

DVPVEKLAAMPALRSINL 


250 


1600 


A 


2693 


459 


21 


LLPGSLGVPILHSQPWDPSPQCPHRAPSTPRRL 
PPLGALSQALTFLSRAAKNHSQDPGKGTKPFP 
AAPAAPPPRSSLPAPLPMGLKDKGPQPAPPTIF 
NSPWHPATLPGALOPQLSQAAPSPIPPPCLMG 
ISSCPDLKLTFCSSTP 


251 


1601 


A


2694 


2


404


FVFDLKLRVPGFAALLIHGASSVPGPETVRLR 
QKRKKKAPDHS SGRKEELVTTHTVDKLETKK
PVGRVLCGLSGELLHSLLLPRRKTEKRALGSH 
RKAGFPEHPVAPEPLSNSCQISKEGREQVLSE1 


252 


1602 


A 


2697 


421 


1 


PQKSHSGAYQCTATRKAQTAQDFAIIALEDG 

TPT11VS55FSFKVVNPGEOFSLMCAAKGAPPPT 

VTWALDDEPIVRDGSHRTNQYTMSDGTTISH 

MNVTGPQIRDGGVYRCTAKNLVGSAEYQARI 

NVRGPPSIRAMRNIT 


253 


1603 


A 


2698 


65 


401 


ACCQWRRTLIPAKSTTVSCTISTPHHPFRGSYS 
FDDHTTDSEALSRSSHVFTSHPRMLKRQPAIEL 
PLGGEYSSDVPRPLSTQLSSSLLGYFSTLMTG 
AAFTKNIASSTIIL 


254 


1604 


A 


2699 


438 


301 


GQIHSQDDPPFIDQLGFGVAPGFQTFVACQEQ 

IV V TvLlr W cAuf u Vul 


255 


1605 


A 


2700 


1


842 


LQNREDSSEGIRKKLVEAEELEEKHREAQVS 

AQHLEVHLKQKEQHYEEKIKVLDNQIKKDLA 

DKETLENMMQRHEEEAHEKGKILSEQKAMIN 

AMDSKIRSLEQRIVELSEANKLAANSSLFTQR 

rNMtL/V^JblliMlbKLKyQJSJ 4 iLbl vJAvj]\X»rSAv2N 

RKLEEQLEKISHQDHSDKNRLLELETRLREVS 

LEHEEQKLELKRQLTELQLSLQERESQLTALQ 

AARAALESQLRQAKTBLEETTAEAEEEIQALT 

VGLGSNIFRLLKASARMSVELALSILAHP


256


1606


A 

A 






4U5 


r VvjUrtrADrr V A VMW UrKAAKMUL 1 A I AJi 
LLKESGNQVLKNGNFSLA1RKYDEA1QILLQL 
YQWGVPPRDLAVIXaSIKSNAFFSLGKWNEA 
FVAAKECXQWDPTYVKGYYIIAGYSLLRLHQ 
PYEAARMFFEGLR 


257 


1607 


A 


2702 


2 


399 


FVESASSRPPGCFSGDGRFWLVSEGSRRGWD 
FNPSFSTLDPRYSVGGDEhnGTVTTLANILREF 
NPSLICGFSVGTGKETSPNAFLNQAVAOGRAE 
DLPVQARRLVDLMKNDTOIHFQEDWKIITLFI 
GGNDL 


258 


1608 


A 


2709 


1 


1097 


SVGARQGEARDRIRRFFPKGDLEVLQAQVERI 
MTRKELLTVYS SEDGSEEFETTVLKAL VKACG 

CQPAC AVI TM7IDT A\7A"\X/\rD\7TYI AHCTT T?T>niTI 
OOISAoA xJuUJDiiKLAV A WISKV UiAvaHLrivOUl 

QWRSFHLEASLMDALLNDRPEFVRLLISHGLS 

LGHFLTPMRLAQLYSAAPSNSLIRNLLDQASH 

SAGTKAPALKGGAAELRPPDVGHVLRMLLG 

KMCAPRYPSGGAWDPHPGQGFGESMYLLSD 

ICATSPr^r DAGT (tOAPWSDT I T WAf I T NP A 

QMAMYFWEMGSNAVSSALGACLLLRVMAR 

I^DAEEAARRJKDLAFKFEGMGVDLFGECYR 

SSEVRAARLLLRRCPLWGDATCLQLAMQAD 

ARAFFAQDGVQSLPTQKWGDMARR 


259 


1609 


A 


2721 


1 


403 


VYLGAGPGLFFSNEGAKEGEKANIPKLMLPR 

SCRRGUKVRLTRVGLAAPSKASRGQEGDAA 
PKSPVREKSPKFRFPRVSLSPKARSGSGDQEE 
GGLRVRLP 


260 


1610 


A 


2728 


1 


477 


LLOGDLRYHLQQNVHFTEGTVKLYICELALA 
LEYLQRYHUHRDIKPDNILLDEHGHVHITDFN 
IATWKGAERASSMAGTKPYMAPEVFQVYM 
DRGPGYSYPVDWWSLGITAYELLRGWRPYEI 
HS VTPIDEILNMFKVERVHY SST WCK.GM VAL 
LRK 


261 


1611 


A 


2730 


3 


547 


LTITDFILVL YRYYRS PL VQIYEIEQHKIETWR 
EjYLQGCFKPLVSISPNDSLFEAVYTT.,IKNRTH 
SEQ ID NO: of nucleotide sequence


SEQ ID NO: of peptide sequence


Method


SEQ ID NO: in USSN 09/496914


Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence


Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence


Amino acid sequence (A=Alanine, C=Cysteine,
D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion)














RLPVLDPVSGNVLfffl.THK3U.LmvfflFCrSLLP 
RPSFLYRTIQDLGIGTFRDLAVVLETAPILTAL 

niPVnPRVSAI A WWFPnTHPnnTJDT fVt nxki 
uir v v o/\Li/\ v v r»E»v»AJ i riiv^jjxiJvL w 

GLGEPGSEERLFPAAITSR 


262 


1612 


A 


2733 


3 


431 


GPEFPGSAKLVFLDLSYNNLTQLGAGAFRSA 

f»PI WI Q1 AISJTSJMF V/T\7UTJr\APPTT DOT r\\TJ U 
OKLr V IvL»oLf.*\JNiNiNJU VuV nrAJAril I Lfc/oLl^ V Lb, 

LNDNNLRSLSVAALAALPALRSLRLDGNPWL 
CDCDFAHLFSWIQENASKLPKGLDEIQCSLPM 
ESRRfSLRACRRPASRV 


263 


1613 


A 


2736 


2 


343 


PARISGVDPPVRKATKGGENCSFEDNKNWQF 
L W ULNUINr Nr r ISJbr W (jvjfvN NHAKur KTT W 
ARSSSQNNRTFQNNRNFLRLQRDSQKKGQFA 
PXISPLVNLPQSPGGLEFQYQAT 


264 


1614 




?7*ft 

£.100 




LPtJ 


D A TV/IT V/T DT?/* , f-\T>Dt>CVTvTlxrT"tlf TA/^m r>o/"»i rntr 

KAIVUjIvlyLKblJl^rrrh i NW I RJ.JXiPLPSGVRV 
DGDTLGFPPLTTEHSGIYVRHDTNEFSSRDSH 
DTVDVLDPPEDSGKQVDL 


265 


1615


A 

A. 




& 


ICQ 
JOB 


AAOU AFLKbLLl^AN R IkF Pr FSDVKGDHRLV 
LMVETTVLVLIFAVSLLGNVCALVLVARRR 
RRGATACLVLNLFCADLLFISAIPLVLAVRWT 
EAWLLGPVACHLLFYVMTLSGSVTILTLAAV 
SLER 


266


1616 


A 


4.1 J J 




1 
I 


ArKJbVOCjyWGLLCfiHLYAlPSKTSEGNWT 

LQGYLPLQDAFHIFQDPLTGDLPWPELILGLP 

V 


267


1617


A


9 7/* A 

•6/Ov 




714 


AS RL EKQN STPE S D YDNTPNDMEPDGMG YM 
HRTSVPGEGLPRARDLAGLGQQKQFTTHTPF 
LYFQTHKGLKDSSIRSEVTCLGISQCWRKGFF 


268 


1618 


A 


2762 


1 


405 


IACTFCGQDEWSPERSTRCFRRRSRFLAWGEP 

AVLLLLLLLSLALGLVLAALGLFVHHRDSPL 

VQASGGPLACFGLVCLGLVCLSVLLFPGQPSP 

A DOT Atf"*/*"lDr CUT T>T t/~i <-ti n<n ttt AA A T?Tt></ttii > 

AKCLAijQrLbHLPLTGCLSllJ^QAAEIF 
LPLSWAE 


269 


1619 


A 


2772 


3 


243 


TRPAEKIQYLVLFFVMSHPSQAYDKLSLSDHL 
UAVLNLLRREVSEHGRHLQQYFNLFVMYAN 
LSKNLSFSEFCFDVSY 


270 


1620 


A 


2789 


1 


486 


ELQSQQACTHTKETEQLRSQLQTLKQQHQQA 

LQLSIERGKVLQNKQAEICQLEEKLEIANEDR 
KHALERFEQEAVAVDSNLRVRELQRKVDGIQ 
KAYDELRLQSEAFKKHSLDLLSKERELNGKL 
RHLSP 


271 


1621 


A 


2795 


1


568 


KEKRVTVQLPTESIQKNQEDKLKMVPRKQRE 

RKHEHLKNKSAPKVVKQKVIDAHLDSQTQN 
FQQTQIQTAESKAEHKKLPQPYNSLQEEKCLE 
VKGIQEKQVFSNTKDSKQEITQNKSFFSSVKE 
^npnnnKTrAi \ttvrpt pitpppf hott ctwad 


272 


1622 


A 


2797 


8 


523 


KCMQGKYAGAMESEPCVCTEADFDCDYGYE

RHSNGQCLPAFWFNPSSLSKDCSLGQSYLNST 

GYRKVVSNNCTDGVREQYTAKPQKCPGKAP 

RGLRIVT ADG KLTAEQ GHNVTLMVQLEEGD 

VQRTLIQVDFGDGIAVSYVNLSSMEDGIXHV 

YQNXGIXRXTVQVDNSLGS 


273 


1623 


A 


2801 


72 


395 


HPSRSNVGPRQLTVWNTSNLSHDNRRKYFS 
DEEGQNQLGIRIHQDIPLPPRRRELPALRTTNG 
KADSLNVSRNSVMQELSBLEKQIQV1RQELQL 
AVSRXTELEEYH 


274 


1624 


A 


2805 


168 


320 


ilwlyfetgtwvypvfaklsllgIaalIfslKe 

EFIARNGVVGETLTHCKRV 
WO 01/57188


275 


1625 


A 


2812 


208 


321 


GSLATCQLSBPLLWFILRVLDTSD ALKAFHD 
MGKIIFQ 


276 


1626 


A 


2813 


41 


266 


AORSLHGAGDRAWVGISPTDWSPKWELCK 
KVOCX)TVVAIDLAGDETlPGSSLLPGHVOAY 
QVGPVRRNGEAGPG 


277 


1627 


A 


2817 


3 


410 


VLQERLDNFQRKCIQLASSTEGKVDKLLMRN 
LFISYLHTPKHKQHEVLQAMGSILGITGEEME 
PT FOFFWGTATRWMTfrWI FfrGW WPKTPT 
GLNQQPALNGSFSELFVKFLKTESLSSTLPTX 
LPPHNSPGKIK 


278 


1628 


A 


2821 


A JO 


■*r J f 


nT^HfiP^flSPPH^PT PTTT^BAOT FT AT TCWRNVF 

VKLRl^LHLEELQMEHDIRHYDLESVPMTWD 
PVDQNPRLV 


279 


1629 


A 


2822 


342 


1 


PL1PANLPAHSNPLQPLPSLPHPFLPATHKFPT 

TPPTF9RVPPPT P«?T CQTT MUQPI UQPT XIPT-TT AC 

CRLPSRPSVSRELPPQSGPASSVPLAPTPLPDS 
VPSQRHPTXPPPAS 


280 


1630 


A 


2825 


307 


77 


PSMVWSYHWGVKQKRLALCVFSFEEGGRRK 
mnvwPT Pinner? tp ??r;PT tvtxtt T/^A\/r:T7Pn 

VAFQCDGQRRREPTC 


281 


1631 


A 


2827 


81 


381 


KMGTAVWVPKEKEKRDKASQEGGDVLGAR 
QDCTPSLKSLVATGNLLDLEETAKAPLSTVSA 
>jT r rKi\jrnR\.n>ppn at q^qq wmia/q^ou a cd c 

IN 1 J 1NJV1JJE. V Jrivrv^/VlvOVJOo Y V W VovJv^VAMsJo 

V1LSLTSG 


282 


1632 


A 


2830 


471 


160 


KLPXDKYELEPSPLTQY1LERKSPHTCWQVFV 
TSSGKYNELGYPFGYLKASTTLTCVNLFVMP 
YNYPVLLPLLDDLFKVHKLKPNLKWRQAFDS 
YT KTI PPYYI 


283 


1633 


A 


2835 


462 


148 


VSPALSLTPTIFSYSPSPGLSPFTSSSCFSFNPEE 
MKHYLHSQACSVFNYHLSPRTFPRYPGLMVP 
PLOCONn4PEES']'OP"SlICLOPPPVGRKNRERVE 
SSEESAP 


284 


1634 


A 


2836 


2 


384 


KTLPRTLLD1LADGTILKVGVGCSEDASKLLQ 
DYGLVVRGCLDLRYLAMRQRNNLLCNGLSL 
KSLAETVLNFPLDKSLLLRCSNWDAETLTED 
QVJYAARDAQISVALFLHLLGYPFSRNSPGEK 
KR 


285 


1635 


A 


2843 


20 


271 


PIl^YYSYSGLDRIX:SVSa,PLAK^WLPDVMlL 
VCDRVSEIX}INRQQAQEWCIKHGFELVELSP 
EELPEEDGKCLCVRRKYGTYI 


286 


1636 


A 


2845 


197 


278 


TAEDVLTVAYEHGVNLFDTAEVYAAGK 


287 


1637 


A 


2851 


2 


427 


FVAEVRREWAK YMF VHFK ASFTNSFT HP AM 

NUWGNLRLUGPl^ 

AVLQNLKRILAKVQEMRDQRVSLEQQLREU 

QKDDITGSLVTTDHSQMKKLFEEQLKKYDQL 

KVYLEQNLAAQDRVLCALT 


288 


1638 


A 


2859 


2 




FVNT GIT TPIRr^frTTTOKMftAHr^PTO^T FT HK" 

LGTSELLPAKNV GWSFNDIMEANLPSPSPKP 
TPSSDMTVRKE YITAKYVDHRFSRKTCSTS S A 
KLI^LLEAJKSRDLLALIQVYAEGVELMEPLL 
EPGQELAETALHL AVRTAD QTSLHLVE 


289 


1639 


A 


2861 


2 


454 


F VASGGPAT ARMS DSQFFC VAEERSGHC A W 

DGNFLYVWGGYVSIEDNEVYLPNDEIWTYDI 

DSGLWRMHLMEGELPASMSGSCGACINGKL 

YIFGGYDDKGYShnaYFVmRTRDETYIWEK 

ITDFEGQPPTPRDKLSCWVYKDRLIYFG 


290 


1640 


A 


2868 


1 


378 


FRQGQLYKVFLHGSQGQVYHSQQVGPPGSAI 
SPDIXLD SSG SHL YVLTAHQ VDRIPVAACPQF 
PDCASCLQAQDPLCGWCVLQGRCTRXGQCG 














RAGQLNQWLWSYEEbSHtLHIOSLLPGHHPRn 
IQE 


291 


1641 


A 


2870 


1 


385 


FRYMPNNRQQLLRKRHIGNDrVTIVFQEPGAL 

PFT?KSIR5HFOHVFVIVKVHNPCTENVrYSV 

GVSRSKDVPPFGPPIPKGVTFPKSAVFRDFLL 

AKVINAENAAHKSEKFRAMATRTRQEYLKD 

LA 


292 


1642 


A 


2877 


3 


188 


RPTRPPPATTQSPESTMDTSLKKEKSAILDLYI 
PPPPAVPYSPRYVAVHCHGMLVSCWCHL 


293 


1643 


A 


2878 


1


427 


REKEEEVEEEEDKVWFTFK'KA FflPKPPn QT 

GAGTHPDAAIPSGERTCGSEGSRSVLDLVNYF 
LSPEKLTAENRYYCESr A ST .fit) A PK" WPr 

GPCVIJLTUJ^SFDLRTMRRRKILDDVSIPLL 
LRLPLAGGRGQAYDL 


294 


1644 


A 


2879 


109 


245 


QLCCFCFRQTTLIVYILSFIGMVU'-l'KrLDLRYI 
IIVFVTGGVLG 


295 


1645 


A 


2880 


3 


320 


NNCVGEQNHRFFCALHCKSKHFCIEFTLNTNF 
FMrFLPfiA'P'KSTTnAPP^T HPTTr nnsvvivTTAr c 

LSESISQ 


296 


1646 


A 


2892 


209 


363 


SQYSHSLDYHLLQV'rKNPFTLGDSSNrcQTE 
RLQEFSQKMDQVRGHWPVST J 


297 


1647 


A 


2893 


8 


424 


SPXTLXLDTFJXLGIQDNILVLILATPPFMAGG 
KLYSTMGRFLRDRKNPACREMAVVLLANLA 
OGD5?I.AARATAVnTifn^Tf»wr T fitzi PncT a at 

QIQQSQASLLHMHNPPFEPTSVDMMRRACRA 
LLALAKVDDNHSEF 


298 


1648 


A 


2894 


310 


445 


FW1YFPSFFMTGYLPLGFEFAVEITYPESEGTS 
SGLLNASAQVNL 


299 


1649 


A 


2895


1 


492 


GYFQAYN VLiLTMQASLPKVLRFCAC AGMIY 
LGYTFCGWIVLGPYHDKFENLNTVAECLFSL 
VNGDDMFATFAOIOOKSILVWI F<?RT VI V<5F1 

SLFIYM1LSLFIALITDSYDTIKKFQQNGFPETD 
LQEF 


300 


1650 


A 


2901 


1 


445 


PWVWSLNGASEVTFSVHVKDGGSFPKTDSt" 
TVTVRFVNKADFPKVRAKEQTFMFPENQPVS 
SLVTTITGSSLRGEPMSYYTASGNT OMTPOTHn 

LTGQVSISQPLDFEKIQKYVYWIE^RIX3GYPP 
FSSYEKLDITVLDVNDNAPIF 


301 


1651 


A 


2902 


162 


433 


THFICI.PLGYCFPLLDKDLQLPSGFNCNFDFLE 
EPCGWMYDHAKWLRTTWASSSSPNDRTFPG 
KPAVSi^MKELRPACSTYFNPRFPYKL 


302 


1652 


A 


2909 


2 


412 


GPQMLCKKlYFIWmSQCQFEWLADIMQEV 
EENDHQDLVSVHIYVTQLAEKFDLRTTMLY1 
CERHFQKVLNRSUn"GUlSrTHFGRPPFEPFFN 
SLQEVHI^VRIOGVFSCGPPGMTKNVEKACQ 
LVNRQDRAHFM 


303 


1653 


A 


2914 


291 


453 


KLNRWLCFFYSWSFGILLYEMVTLGAPPYPE 
WPTS11JEHLQRRKIMKRPSSCS 


304 


1654 


A 


2926 


179 


354 


PGWSQAiRKAESLKKCLSVMEAKVKAQTAP 
NKDVQRE1ADLGEVGAASLPPSSGPGA 


305 


1655 


A 


2938 


135 


438 


GMGYLhlAKGlLHOKDlJCSKNVFYDNGKVVIT 
DFGLFSISGVLQAGRREDKLRIQNGWLCHLA 
PEIIRQLSPDTEEDKLPFSKHSDVFALGTIWYE 
LHAREWP 


306 


1656 


A 


2944 


2 


329 


VRWNSCVNCSCAFGNGASLSTSLGESSGCLW

EIG1CWLSCSLLSFPSPLAVU1TFC3VTVLGREA 

LTKGAXWAVFLLAGSALLCAEVTGV1WRQPE 














SKTKLSFKVSSSA


307


1657


A




2


.... 

*H 1 


MVT PTAlTWQAn-^AlwrnVTRT WnVPPVTFMOT 
IN X JLrl^l/VTkJi OAVJOr\LVlAJX\.i X\l> V Y^V i 1 V ICriNVJJU 

PDl^TTEGSHAFLPCKARGSPEPMTWDKDGQ 

CTAENAVGRARRRVHLTILVUPVFTTLPGDRS 
LRLGDRLWLR 


308 


1658 


A 


2951 


1 


407 


PTRPPRVRFDNEFDAESQRKRT1SVSKMEKM 
DSSLPEEEEDEDKEAINGSGNAENRERHSESS 
u wjmn.i vra ir^v^ir^ooJVUJrKiN i iviJYiixLijc.iLiir 
LPKN WEKtA YTDTGMI YFIDHNTKTTTWLDP 
RLCKKAKAPEDC 


309 


1659


A 


2954 


2 


179 


QDFLTLTLTEPTGLL YVGAREALFAF SMEALE 
LQGAVRGGAVOGSRACQRARPRGAVLG 


310 


1660 


A 


2959 


1 


419 


QDMMERAIIDTF VGH DWEPG S YVQMFP YPC 
YTRDDFLr VliiHMIVlrl^CMVtDW V YoVAMl 1Q 
HIVAEKEHRLKEVMK1MGL1WAVHWVAWFI 
TGFVQLS1SVTALTAILKYGQVLMHSHWIIW 
LFLAVYAVATIMFCF 


311 


1661 


A 


2963 


3 


465 


MKPQMPGLGAPNGYGPGRGRAGVPGGPERR 

P W VPHLLPr b SPG Y VMKAQKFCjACjE CjMK 

PQKPGLRGTLKPQKSGHGHENGPWPGPCNA 

RVAPMLLPRLPTPGVPSDKEGGWGLKSQPPS 

AVQNGKLPGHQPPNGYGPGAEPGFNGGLEPQ 

KI 


312 


1662 


A 

A 


2967 


3 


405 


WL AQE W Sr CI VTCOQGLR YKV VLOIDHKuM 

HTGGCSPKTKPECKEECIVPTPCYKPKEKLPV 

EAKLPWFKQAQELEEGAAVSEEPSFIPEAWS 

ACTVTCGVGTQVRIVRCQVLLSFSQSVADLPI 

DECEGPKPA


313


1663


A




2


43 U 


\ a; a nxirD nnvT hat "d t?t udu/it tvud\/t vt/nr 
V V AUlNCK-l^u X LlJAJLKr L.JbrvKVjL 1 JSJir V.L W 1 

LVSKEPPAPADGNWDAGCDQRRKGGLSLNW 

KVPHVQVKDVPNFEQLSPFXEAALKKACTRD 

PSRWARFWHSGPGQVLTYLLLPCTLPFEYIYF 

RSRRLWWLPDVPADLWWMQ 


314 


1664 


A 


2971 


422 


33 


LDXSHNALQRLRPGWLAPLFQLRALHLDHNE 
r nAi nxi <^\/pw a cm pt t t\t q^mtt p at rmxi 

LUALAJKAjrVr VjNA001-JtUjlJJl-iOo£N J. J-JUU^LTKxl 

DLDGLGAI^KLLLFNNRLVHLDEHAFHGLRA 
LSHLYLGCNELASF SFDHLHGLS ATHLLTLDL 
SSNRM 


315 


1665 


A 


2973 


1 


525 


ITYSTHASGSPFGLEPQ SGWLWVRAALDREA 

r»PT VTT VVMAV^nCIf APT rtfV^TY»TA*r\/PV^T 

LNQNEHSPRLSEDPTFLAVAENQPPGTSVGRV 
FATDRDSGPNGRLTYSLQQLSEDSKAFRIHPQ 
TGEVTTLQTLDREQQSSYQLLVQVQDGGSPP 
RSTTGTVHVAVLDLNDNT 


316


1666


A 




2


Ann 


GTDANVYLTIYGEEYGDTGERPLKKSDKSNK 
FEQGQTDTFT1YAIDLGALTKIRIRHDNTGNR 

DDGQLSRE 


317 


1667 


A 


2981 


3 


440 


VLNCQGRPm'VRINGDGQEVXYLAESDNVR 
LGCPYVLDPDDYGPNGLDIEWMQVNSNPAH 
HRENVFLS YQDKRINHG SLPHLQHRVRFAAS 
DPSQYDASINLMNLQVSDTATYECRVTCKTTM 
ATRKVIVTVQARPAVPMCWTEGQ 


318 


1668 


A 


2995 


119 


414 


LPEK£FPIIRKSSSLKVTKCLFTEQPKPI11LRFA 
ENYDARLLRJDIANTLREQVQELFNKTYGKQ 
RRTPGEGHVAAVDREVAGFPVPAEGISGETIH 


319 


1669 


A 


2999 


2 


332 


GFFAYTYGRLWVEDLHSGAQQHWSGHSAEI 














YSGOLCQlilFPHSTTVlj^SP^ 
GDHDORTLALWGTGHL 


320 


1670 


A 


3000 


693 


322 


u/lo i uJUU 1 vfi i ID i si K. I o YMMNV SATDQA 
PPFNQGFCSVYITLLNELDEAVQFSNASYEAA 
ILENLALGTEIVRVQAYSIDNLNQrTYRFDAY 

T^TOATiTAT PfcTH A TTVT? ClXklfirin A DTCTtdt 
1 0 1 \{e\S%J\LirKJJJ/\X I V t\\j Wvjy UArr r rl 


321 


1671 


A 


3001 


6 


383 


RDPRGKACXTVLGRSTGELEGFASSRLPPQPC
GWGQS SDLLSRIDLDELMKKDEPPLDFPDTLE 

WC I /iri\j&A.uyJLJvClIA.i vJEfT VriN YlthHI.HR 

WNQKRYEALGEUTKYVYELLEKDCNSKKVS 


322 


1672 


A 


3007 


192 


447 


xvevv luNai^r r LiKlj4^o^l^AV-.Uroor V W VrLr. I Cjr 

LFPWLFLQVEVIKKAYMQGEVEFEDGENGK 

Hfi A A ^PRNWrVTMTVTT AWA1 ABU 


323 


1673 


A 


3019 


18 


245 


KELLFYHLrVNNINFFNTRYAKJfflPIIASVSEH

QPTTWVSFFFDLHn^VCTFPAGLWFCIKNIND 

ERVFGKRGF 


324 


1674 


A 


3020 


523 


797 


LCYFSARYHQRKIFGTXYTJFTLSAINKKEPNLFI
i Lr lr r JCMtioHb Y 1 HAuVQRHNLNSLQPLPPG 
FKRFSCLCFLSSWNYRGAPPGPANF 


325 


1675 


A 


3022 


2 


156 


NDFLPLYFG WVLTKKS SETLRKAGQ VFLEEL 
GNHKAFKKELRQCRWQVGAL 


326 


1676 


A 


3023 


38 


172 


KMVRGSKKLJSFFPGGPYGILAGRDPSKGLAT
FCLNKEALKDEFE 


327 


1677 


A 


3027 


1 


385 


LTLEFLLLPAASELAHGKRLACCrVl)HKXPEC 
GFYGLYDKILLFKHDPTSANLLQLVRSSGDIQ 
EGDLVWVLSASATFEDFQIRPHALTVHSYRA 
PAFCDHCGEMLFGLVRQGLKCDGCGLNYHK 
RC 


328 


1678 


A 


3030 


13 


569 


ITRPTISCQRPGPGLAAGMI .PYTVNFKVSART

LTGALNAHNKAAVDWGWQGLIAYGCHSLV 

VVID SITAQTLQ VLEKHKAD WKVK WAREN 

YHHNIGSPYCLRLASADVNGKIIVWDVAAGV 

AQC^IQEHAKPIQDVQwLWNQDASRDLLLAI 

HPPKYIVLWNADTGTKLWKKSYADNILSFSF 

D 


329 


1679 


A 


3038 


90 


744 


SVNLPPSLWPWEEAMDSTKSEPLKGSPEAED 

GMEYKKLWPSQYRFBHLVTQMKWRLQEG 

kajiiay i y Uj v fclJNuLL VvjLAjtiJbbMRASJ^TL 

HRMAEKVGADITVLREREVDYDSDMPRKITE 

VLVRKVPDNQQFLDLRVAVLGNVDSGKSTL 

LGVLTQGELDNGRGRARJLNLFRHLHEIQSGR 

TSSISFE1LGFNSKGEVHGINGTQWGQTLRMG 


330 


1680 


A 


3040 


3 


397 


LCSTLLLLTIPSWVLSQITLKESGPTLMKPTET

LTLTCTFSGFSLNTSGVGVAWIRQPPGKALE 

WLALIYWDDDKRYSPSLNDRLTIAKDTSRNQ 

VVLTMTNMGPVDTAn r YCAQFARGARGSN 

WFDPWGQ 


331 


1681 


A


3043 


3 


1509 



AGIRHEAPPTTSNIUiRRQIDRGVTHLNISGLK

MPRGIAJDWVAGNVYMTOSGRDVIEVAQMK 

GENRKTUSGMIDEPHAIVVDPLRGTMYWSD 

WGNHPKJETAAMDGTLRETLVQDNIQWPTG 

LAVDYHNERLYWADAKLSVIGSIRLNGTDPI 

VAADSKRGLSHPFSIDVFEDYIYGVTYINNRV 

FKIHKFGHSPLVNLTGGLSHASDVVLYHQHK 

QPEV1WCDRKKCEWLCLLSPSGPVCTCPNG 

KRLDNGTCVPVPSPTPPPDAPRPGTCNLQCFN 

GGSCFLNARRQPKCRCQPRYTGDKCELDQC 














V.THCKNtOGTCAASPSGMPTCRCPTGFTGPK:C
TQQVCAGYCANNSTCTVNQGNQPQCRCLPG 
FLGDRCQYRQCSGYCENFGTCQMAADGSRQ 
CRCTAYFEGSRCEVNKCSRCLEGACWNKQS 
GDATONCIDGRVAPSCLTCVGHCSNGGSCT 
MNSKMMPECQCPPHMTGPRCEEHVFSQQQP 
GHIASILIP 


332 


1682 


A 


3045 


3 


952 


TTTISNFHTQVNRTYCCGTYRAGPMRQISLVG 

LSSLRLQCRSQSDDGPIMWVRPGEQMIPTAD 
MPKSPFKRRRSMNEIKNLQYLPRTSEPREVLF 
EDRTRAHADHVGQGFDWQSTAAVGVLKAV 

LDLHEPPVSQCVQWVDEAKLNQMRREGIRY 
ARIQLCDNDIYFIPRNVIHQFKTVSAVCSLAW 

MTRT KnYWPWF ATOl^JTr^N^nV/fnCfrT TOICR 
niXvi itvy^ I rur v v Ci/\ i v<rN x x>oi> o.nlvil/\^vjl# i virvrv 

ELEVDSQCVRIKTESEEACTEIQLLTTASSSFP 
PASE 


333 


1683 


A 


3046 


497 


167 


SACSTGPELPGRATRSLTRPANQKGCDGDRL 
YYDGCAMIAMNGSVFAQGSQFSLDDVEVLT 
ATLDLEDVRSYRAEISSRNLAVSAPVDTCVG 
CSSKTWKVAPFVRAWWRP 


334 


1684 


A 


3053 


37 


276 


Vl'IDLEEQLNQLTEDNAELNNQNFYLSKQLD 
EASGANDEIVQLRSEVDHLRREITEREMQLTS 
QKQVRRVNKVVRSLEDF 


335 


1685 


A 


3054 


2 


840 


NCEGQNIRYKTCSNHDCPPDAEDFRAQQCSA 

YNDVQYQGHYYEWLPRYNDPAAPCALKCH 

AQGQNLWELAPKVLDGTRCNTDSLDMaSG 

ICOAVGCDROLGSNAKEDNCGVCAGDGSTC 

RLVRGQSKSIIVSPEKREENVIAVPLGSRSVRI 

TVKGPAHLFIESKTLQGSKGEHSFNSPGVFW 

ENTTVEFQRGSERQTFKIPGPLMADFIFKTRY 

TAAKDSVVQFFFYQPISHQWRQTDFFPCTVT 

CGGG 


336 


1686 


A 


3058 


54 


347 


WGKQEAGAHSDSCCLLHTPPRLTPAHSRKA 
LRNSRIVSQKDDVHVC1MCLRAIMNYQVSRG 
AWDWRT fiSPAPPHWOLHTCLPRI AYDPI SLYP 
VLCWGT 


337 


1687 


A 


3059 


2 


709 


JLtSLVELTRFETLTPRFSATVPPCWVEVQQE 
QQQRRHPQHLHQQHHGDAAQHTRTWKLQT 

SNITNIPQIQVTLLKNKAPGLGKVNGLRLCPF 

LEDHKEDILCGPVVVLASGLDLSGHAGMLTLT 

SPKLVKGMAGGKYRSFL1HVKAVNERGTEEI 

CNGGMRPWRLPSLKHQSNKGYSLASLLAK 

VAAGKEKSSNVKNENTSGTRK 


338 


1688 


A 


3060 


85 


384 


KAFYNYHVLELLQMLVTGGVSSQLEQHLDK 
DKVYGVADSCTSLLSGRNRCKLGLLSLHETIL 
SDVNPRNTFGQLFCGSLDLFG1LCVGLYRIIDE 
EELNP 


339 


1689


A 


3063 


236 


362 


CFLCLSGDFMVMTIFFNVSRRFGYVAFQNYV 
PSSVTTMLSWV


340 


1690 


A 


3065 


3 


1249 


DLWQFTPLHEAASKNRVEVCSLLLSYGADPT 

LLNCHKXSAIDLAPTPQLKERLAYEFKGHSLL 

QAAREADVTRIKKHLSLEMVNFKHPQTHETA 

LHCAAASPYPKRKQICELLLRJCGANINEKTKE 

FLTPLHVASEKAHNDWEWVKIiEAKVNAL 

DNLGQTSLHRAAYCGHLQTCRLLLSYGCDPN

HSLOGFTALOMGNENVOOLLQBGT^LGNSP.A 
DRQLLEAAKAGDVETVKKLCTVQSVNCRD1E 
GRQ STPLHF AAGYNRVS WE YLLQHG AD VH 

AYTWrtflT VDT UXT A rCV^UVT:^ 1 ACT T \rvtrr* a 

AJSJL/lS.uuLVrLfuN JvJo Yun YEVAELLVKHGA 

VVNVADLWKFTPLHEAAAKGKYEICKLLLQ 

HGADPTKKNRDGNTPLDLVKDGDTDIQDLLR 

GDAALLDAAKKGCLARVKKLSSPDNVNCRJD 

TQGRHSTPLHLAGK 


341 


1691 


A 


3070 


i 
i 




u VLlrorv^NQLr ADLLAGIES VTSEHN YQTLIA 
NYNYDRJDSEEESVINLLSYNIDGI1LSEKYHTI 
RTVKFLRSATIPVVELMDVQGERLDMEVGFD 
r>LKl^AArlJM I MLiiKJ<VKJnuuLYLGSKDDT 
RDEQRYQGYCDAMMLHNLSPLRMNPRA1SSI 

HI T>\>l/Yr X/TDF\ Al C A XTDTYT rv/ix re/^TKl 

JtlLluVlVLMKUALoArNrDLDuV 


342 


1692 


A 


3073 


463 


3 


RINRCRKPSDADILVPGDTISLIGTTSLRIDYNE 
IDDNRVTAEEVD1LLREGEKLAPVMAKTRILR 
AYSGVRPLVASDDDPSGRNVSRG1VLLDHAE 
RDGLDGnTlTGGKLMTYRLMAEWATDAVC 
KJsJLuIN 1 K_rL 1 J ADLALru a QEr AK Vr 


343 


1693 


A 


3075 


250 


1 


LLIYLAIFAPVAMSALAGVKSVQQVRIRAAQS 
LGASRAQVLWFVILPGALPEILTGLRIGLGVG 
WSTLVAAELIAATRGLGFM 


344 


1694 


A 


3076 


2 


138 


LYFDAYLQSLQVAAISTFCCLUGYPLAWAV 
AHoKra rRNILLLL 


345 


1695 


A 


3078 


469 


3 


LBORGQRIELGEIDRVMQALPDVEQAVTHAC 

VINQAAATGGDARQLVGYLVSQSGLPLDTSA 

LQAQLRETLPPHMVPWLLQLPQLPLIANGKL 

HPY AT PI PT5T VAAADAD ADV A/IOCTTT A A a tic 

SLLGCDVQDADADFFALGGHSLIAMKLAT 


346 


1696 


A 


3082 


404 


2 


QNITSKDLDVRLDPQTVPIELEQLVLSFNHMI 

ERIEDVFTRQSNFSADIAHEIRTPITNLITQTEI 

ALSQSRSQKELEDVLYSNLEELTRMAKMVSD 

MLFLAQADNNQLIPEKKMLNLAHEVGKVFD 

QFEALPE 


347 


1697 


A 


3084 


3 


340 


NELTFKEAEISKLYTKVHPAYRTLLEKRQALE 
UciUUvJLNCjK V I AMrK I QQbl VKLTRD VESGQ 
QVYMQTXNKEQELKTJEASTVGDVRrVDPATT 
QPGVLKPKKGLIILGAI 


348 


1698 


A 


3086 


723 


10 


l y/VM v w ^yJwVCAbUUrv^aCjKJi WLHAATL 
YN1AAYPHLKGDDLAEQAQALSNRAYEEAA 
QRLPGTJVtRQMEFTVPGGAPrTORHMPKGDG 
PFPTVLMCGGLDAMQTDYYSLYERYFAPRG1 
AMLTIDMPS VGFSSKWICLTQDS SLLHQHVLK 
/UjrjN vrw Vi^iilxvVA/U^OrKrvyAlN VAVKLAY 
LESPRLKAVACLGPWHTLLSGLKCQQQVPE 
MYLDVLASRLGMHDASTKSSTRENH 


349 


1699 


A 


3087 


2 


249 


RTRS SDPEITLAGTPLHAAYLIGMTI .1C AGFSV 
GFGVAMSQALGPFSLRAGVASSTLGIAQVCG 
S SL WI WLAA WGIGAWNM 


350 


1700 


A 


3099 


3 


424 


EAPEATPQPSQPGPSSPISLSAEEENAEGEVSR 
ANTPDSDITEKTEDSSVPETPDNERKASISYFK 
NQRGIQYIDLS SDSED WSPNCSNTVQEKTFN 
KDTVUVSEPSEDEESQGLPTMARRNDDISELE 
DLSGMEDLK 


351 


1701 


A 


3108 


2 


404 


nOasTillGYQLLHRRALFEKRTRLSDYALIFG 

MFGIWMVIETELSWGAYYKAPLYSLALKCL 

ISLFT1ILLGLTIWHAREIQLFMANYGADDWR 

SALTYEPIFLILLEALRGVIHATPCRVSLSLWD 

GLDLP


352 


1702 


A 


3110 


341 


2 


AQLAEVCPPQTLLTTNTSSISITAIAAEIKNPER 
VAGLHFFNPAPVMKLVEVVSGLATAAEVVE 
QLCELTLSWGKQPVRCHSTPGFIVNRVARPy 
YSEAWRALEEQVAAPEVI 


353 


1703 


A 


3111 


3 


188 


HFSLFRIAFAWLTYMTVGLPLPVIPLFVHHEL 
GYGNTMVG1AVGIQFLATVLTRGYAGRLA 


354


1704


A


3110 


30/ 




WHY T?U1 XtnTPI MTf^TTTTYTP QfVMnWWnD QQ 
YT^J-irxlLdNvji rL»INHjiilly l£rol» ViNvjW v i Uivoo 

FPFSNMTEVRGLVFLS 


355 


1705 


A 


3117 


101 


53 


VTNLVYLISSPRPELKPVDKESEVVMKFPDGF 
EKFSPPILQLDEVDFYYDPKHV1FSRLSVSADL 
boKI UVV LrbN uAuKo i mLKlAMUuLVur VKCri 
RHAHRNLKIG YFSQHHVG AAGT* TFS ACGNL 
LGTQVFLGRPEEEYXRHQLGFGMGISGELGHA 
SSLPACLGGQKEAEVAFCSDGLLPCPNFL\TL\ 
DEPTN\HLGHGRAIEALGPCLQTISGVGVILVS 
HE* SALSRLVCRE\LWVC*GRSTSPF 


356 


1706 


A 


3121 


137 


466 


RGGRDWGEHNQRLEEHQARAWQGAMDAG 
AASREHARWQGTGLAPGTRVAVAPTCVQGL 
PQERS VCRPFFS SR WREGPVWALGAGAHGKP 
RWSGGVRCWRGGRWFTPAPH 


357 


1707 


A 


3124 


1249 


229 


MLEAPGPSDGCELSNPSASRVS CAGQMLEVQ 

PGL YFGGAAA VAEPDHLREAGnA VLT VT>SE 

EPSFKAGPGVEDLWRLFVPALDKPETDLLSH 

LDRCVAFIGQARAEGRAVLYHCHAGVSRSV 

AlITAFlJvIKTDQLPFEKAYEKLQILKPEAKMN 

EGFEWQLKLYQAMGYEVDTSSAIYKQYRLQ 

BCVTEKYPELQNLPQELFAVDPTTVSQGLKDE 

VLYKCRKCRRSLFRSSSDLDHREGSGPIAFAH 

KJRMTPSSMLTTGRQAQCTSYFIEPVQWMESA 

LJ^ijYMJL/Uvi^^*AvCa^ roiJVtoL/ 

GRWITPAFQIHKNRVDEMKELPVLGSQTGKI 


358 


1708


A 


3127 


816 


139 


EVETLGPRTPGP/EAQSPTPGSCPGWQEPSPGP 
Tppwi cripnporiAPVT ran t prvPFFTPAnvTP 

LGKHFWWGL\PVTSANFSPGAAA*FGGALSPP 

GGDL/GHMLLQGPPSPFRLQQQ*QTPPGSHSP 

PTANREINPGPAAAADTRSCWGHKRSWRGW 

RfJT APWRT ftFfWPOrP+PAPAOTP/fiRPTWFGG 

K1GAGGKPSETLTRSPPVWRGKRGSANGFLSW 

VQILQ 


359 


1709 


A 


3132 


3 


191 


HEHLLIXLLCVFLVKSQGVNDNEEGFFSARG 
HRPLDKKREDAPNLRPALADVJTVCDYRAQIA 
*AASTPKRAASIAHNAVSCR*AQIA 


360 


1710 


A 


3134 


1 


286 


REPPRPALLFF* DRVSLCCPGWNAV VQSQLT 
AAPTSQVQ/SDSPTFPSSWDYRHVPEYPANFL 
♦RQGFPMLP1UVSNSWAQTVHPPRPPKVLDL 
QA 


361 


1711 


A 


3135 


56 


1449 


PVPAPRVSPSARGAPGRPRLPGVRGPRHS/WA 
AD* RGSRM/PPRAPAPSPTGP/APGGKKVRGR 
VPEDPDAYEPRCSAL*V*PTHVTSPQFCDP*N 

CiOTT} CVFTV7 T "RnT XTFTMT VTC/PT PliRFP/PFA 

GPGRQSTPAVTRDHRQHEDPRGAGRQWDAD 
PRPSAP/PAEVATGSRPGRHMWMRLCLAAQQ 
APGLPHRTSIRPGWRRLTEPEAWARRHRRPW 
GQRGAVRPPPQGAAPPPSHQGRRTNTOPSAT 
PRLTVMSRCLAPDLKAPASGPRGWRRGMPQ 
SS/GALLWTPPPTPRGSHSPRPREAPLRA1HPA 
GPSK/SRAGASGRLPBVtYGWVTLFTPPEAGT 
F/L1PSPT*MSPALVIQPPVPPTQMGLRISGLPR 
QG*PSGAPW*LPGLAQLAFQCHI PHDEVGPP

T)>>Tr\crjT /1XTTVTT Poor n» /nnnn r\\ ttj rrn intr*-* 

GHSSPREPQVLKKPLWGQTDIAGVOSASLYP 
DNL 


362 


1712 


A 


3136 


1270 


274 


RVGMVLGTREVGDSTPPPSPPLYPFTGNEFVQ 

HNTWQLSRVYPSDLRTDS SNYNPQELWNAG 

CQM/V*GGSRDWEEGVEEQQVGNKFSSDGR 

VGECSRKLLG*EMLSVDITSRYRAPSTYLLNS 

LKEGLEGLHGESCSSFLLGPSVAMNMQTAGL 

EMDICDGHFRQNGGCGYVLKPDFLRDIQSSF 

HPEKPISPFKAQT1XNQVISVQQLPKVDKTKE 

GSIVDPLVKVQIFGVRLDTARQETNYVENNG 

FNPYWGQTLCFRVLGPDFPMLRFGKMDYDW 

KSRNDLLGKTPCPGTCMQQG YRI IIHLLSKDG 

ISLRPASIFVYICIQEGLEGDES 


363 


1713 


C 


3139 


60 


248 


MFAGSYGKSMFSFSKKVLNCLPKWRYHFVIA 
P AMNESPL APHLHQHLVFSVFQ VLTILIGV* * 


364 


1714 


A 


3140 


57 


418 


S AFKTLQLP AFSL YFDLG SLKLLILRJHTSI VK 
NHKVESPRTMSPG*DPQSFLQIPQPRPPQLRV 
GLTSGLIQHFHSPSSCQFPLLRGPPFPRQPPLGI 
SGASLCPVLSPPR*PLQPSSL 


365 


1715 


A 


3145 


122 


413 


LLPYPSLFVFLRQCHFVARLECNGWSAHCN 
LHLPGSSDSPASAS+VAGTTGVCHHTRLIFWF 
L V*TGFHY VAQ AGLELLTA* S\PPQLPKWGL 
QA 


366 


1716 


A 


3150 


247 


2 


VGEKLHDIRFGNDFDMTPKAQATKEKIDKLN 
FIKIKKLCIEGYY/NREPQNGRKIFANYVS\DK 
GLMATIYEELLKLSNKLIQ 


367 


1717 


A 


3152 


3 


2367 


QKLKQNQPKRAHVEDGGSRSKQGNEQSKKT 

PIEKSDFAAATHPRAFYLSKPDETPNAWMSD 

SGTGLTYWKLEEKDMHHSLPETLEKTFISLSS 

TDVSPNQVLTLDPTLHMKPKQQISG1QPHGLP 

NALDDRISFSPD SVLEPSMS SPSDIDSFSQASN 

VTSQLPGFPKYPSHTKASPVDSWKNQTFQNE 

SRTSSTFPSWTn'SNDISVKTVDEENTVMVAS 

ASVSQSQLPGTANSVPECISLTSLEDPVILSKIR 

QNLKEKHARHIADLRAYYESEINSLKQKLEA 

KEIS G VED WKJTNQIL VDRCGQLDSALHE ATS 

RVRTLENKNNLLEIEVNDLRERFSAASSASKI 

LQERIEEMRTSSKEKDNTIIRLKSRLQDLEEAF 

ENAYKLSDDKEAQLKQENKMFQDLLGEYES 

LGKEHRRVKDALNTTENKLLDAYTQISDLKR 

MISKLEAQVKQVEHENMLSLRHNSRIHVRPS 

RANTLATSDVSRRKWL1PGAEYSIFTGQPLDT 

QDSNVDNQLEETCSLGHRSPLEKDS SP/GSSST 

SIJLIKKQRETSDTPIMRALKELDEGKIFKNWG 

TQTEKEDTSNSIX*/INPRQTETSVNASRSPEK 

CAQQRQKRLNSASQRSSSLPPSNRKSSTPTKR 

EMLTPVTVAYSPKRSPKENLSPGFSHLLSKN 

ES SPIREKTYSEKATDNHVNHSSCPEPVPNG V 

KKVSVRTAWEKNKSVSYEQCKPVSVTPQGN 

DFEYTAKIRTLAETERFFDELTKEKDQIEAAL 

SRMPSPGGR1TLQTRLNQVKCLSLNLL 


368 


1718 


A 


3163 


2 


2350 


EFKSGGCGAGLVAAGAVLVLYPASRAGERT 

RVPGSPAPSSLPLHSPGACGTEVDMDPQRSPL 

LEVKGNIELKRPLIKAPSQLPLSGSRLKRRPDQ 

MEDGLEPEKKRTRGLGATTKITTSHPRVPSLT 

TVPQTQGQTTAQKVSKKTGPRCSTAIATGLK 

NQKPVPAVPVQKSGTSGVPPMAGGKKPSKRP 

AWDLKGQLCDLNAELKRCRERTQTLDQENQ

QLQDQLRDAQQQVKALGTERTTLEGHLAKV

QAQAEQGQQELKNLRACVLELEERLSTQEGL 

VQELQKKQVELQEERRGLMSQLEEKERRLQT 

SEAALSSSQAEVASLRQETVAQAALLTEREER 

LHGLEMERRRLHNQLQELKGNIRVFCRVRPV 

LPGEPTPPPGLLLFPSGPGGPSDPPTRLSLSRSD 

ERRGTLSGAPAPPTRHDFSFDRVFPPGSGQDE 

VFEEIAMLVQSALDGYPVdFAYGQTGSGKTF 

tti /'cn/^'O/ir^'rxrv^vT t?/t.t rnn at tit tt t?c\ t a /"\t?t o/~< 

IMbbGPUUDl^LbGLIPRAL^ 
QG^TYSFVASYVEIYNETVRDLLATGTRKGQ 
GGECEIRRAGPGSEELTVTNARYVPVSCEBCEV 
DALLHLARQNRAVARTAQNERSSRSHSVFQL 

rNTo^troocrj m nrr adt ct \/ni a rcrtti rvm-«T 
V^loUbrib dKULA^V/VJ At LbL V UL AOobKLDr QiL 

ALGPGERERLRETQAINSSLSTLGLVIMALSN 

KISHVPYRNSKLTYLLQNSLGGSAKMLMFV 

N1SPLEENVSESLNSLRFASKVEPSVLFGTAQS 

JNKJS. W lVl Ur ULA^ V l~ V C V C V L. V L. V ^ V CVLVr 

MSMYRVRGGRVAGGCFIGWRAPCPRAIK 


369 


1719 


A 


3165 


365 


12 


GYTSQGRWIDIERGPLT ANTESLI IENNFN ALP 

nvTDvn7*t*TVWvr*t\n:/irai/^iT t "kitx/vtcti c ns 
\j I IKJKJUb* III JfJSSS * 1IN r VjlJ V OLLM I V JsJLa JLLo/Iv 

lYRFDAIPVKlLTRFFINLDKLILKJVLKTKIAK 
NRIKTFxTMRRKKLGDSS 


370 


1720 


A


3170 


393 


42 


GASISPSAVIDGVEGLKPMQEQEAQEAGPCLD 

♦TJMAPPnWVAPTJVPT T PPT TPQVT UAT TTAA A A 

QSSAEEDEDPRN*GQSSEDQAPNQNGLIVIVH 
RVHVPLGAAATVPVHRSHFPR 


371 


1721 


A 


3173 


770 


510 


GNGGCGLSQIPPSHLGAFSRGSLLSRGXDPRGP 

PPHPVIFFVFWE\QGFTVLARMVSIS*PCDPP 

ALASQSAOTGVSHLARPQNLYF 


372 


1722 


A 


3180 


381 


76 


RVLHHDNVPAHSSPQKRE1SQEFQLE1RHLP*S 
PDLAPSGCFLFLNLKNIFK\GTHFSLVDNVKK 
TVSTWLH/SQNAQFYKDRLNGWYHCLQKCL 
QHY*AYVEK 


373 


1723 


A 


3181 


410 


14101 


RREVAGPEGKGLLLASAHTMLTPPLLLIXPLL 

SALVAAAIDAPKTCSPKQFACRDQITCISKGW 

RCDGERDCPDGSDEAPEICPQSKAQRCX^PNE 

HNCLGTELCYPMSRLCNGVQDCMDG SDEGP 

HCRELQGNCSRLGCQHHCVPTLDGPTCYCNS 

SFQLQADGKTCKDFDECSVYGTCSQLCTNTD 

GSFICGCVEGYLLQPDNRSCKAKNEPVDRPP 

VLLIANSQNILATYLSGAQVSTITPTSTRQTTA 

MDFS YANETVCWVHVGD SAAQTQLKCARM 

PGLKGFVI>EHTMSLSLHHVEQMAIDWLTGN 

FiTVDDIDDRIFVCNRNGDTCVTLLDLELYNP 

KGIALDPAMGKVFFTDYGQIPKVERCDMDG 

QNRTKLVDSKIWPHGITLDLVSRLVYWADA 

YLDYIEWDYEGKGRQTIIQGILIEHLYGLTVF 

ENYLYATNSDNANAQQKTSV1RVNRFNSTEY 

QWTRVDKGGALHIYHQRRQPRVRSHACEN 

DO YGKPGGCSDICTX ATsTSHKARTCRPP. SfiFS 

LGSDGKSCKKPEHELFLVYGKGRPGURGMD 

MGAKVPDEHMIPIENLMNPRALDFHAETGFI 

YFADTTSYLIGRQKIDGTEREmKDGIHNVE 

GVAVDWMGDNLYWTDDGPKKTISVARLEK 

AAQTRKTLIEGKMTHPRAIVVDPLNGWMYW 

TDWEEDPKDSRRGRLERAWMDGSHRDEFVT 

SKTVLWPNGLSLDIPAGRLYWVDAFYDRIETI 

LLNGTDRKIVYEGPELNHAFGLCHHGNYLFW 

TEYRSGSVYRLERGVGGAPPTVTLLRSEVRPPI

FE!R^CyDA 0 H or> VGSNKGRVNNAGGSSLGL 
ATrcSI^ 

PPQCQPGEFACANSRCIQERWKCDGDNDCLD 

NSDEAPALCHQHTCPSDRFKCENNRCIPNRW 

LCDGDNDCGNSEDESNATCSARTCPPNQFSC 

ASGRCMSWTCDLDDDCGDRSDESASCAYPT 

CFPLTQFTCNNGRC3NINWRCDNPNDCGDNS 

DEAGCSHSCSSTQFKCNSGRCEPEHWTCDGD 

NDCGDYSDETHANCTNQATRPPGGCHTDEF 

QCRLDGLCIPLRWRCDGDTDCMDSSDEKSCE 

G VTHVCDPS VKFGCKD S ARCISKA WVCD GD 

NDCEDNSDEENCESLACRPPSHPCANNTSVC 

LPPDKLCDGNDDCGDGSDEGELCDQCSLNN 

GGCSHNCSVAPGEGIVCSCPLGMELGPDNHT 

CXJIQSYCAKHLKCSQKCDQNKFSVKCSCYEG 

WVLEPDGESCRSLDPFKPFIIFSNRHEIRRIDLH 

KGDYSVLVPGLRNTIALDFHLSQSALYWTDV 

VEDKIYRGKLLDNGALTSFEWIQYGLATPEG 

LAVDWIAGNIYWVESNLDQIEVAKLDGTLRT 

TLLAGDIEHPRAIALDPRDGILFWTDWDASLP 

REEAASMSGAGRRTVHRETGSGGWPNGLTV 

DYLEKRILWDDARSDAIYSARYDGSGHMEVL 

RGHEFLSHPFAVTLYGGEVYWTDWRTNTLA 

KANICWTGHNVTVVQRTNTQPFDLQVYHPSR 

QPMAPNPCEANGGQGPCSHLCLINYNRTVSC 

ACPHU1KLHKDNTTCYEFKKFLLYARQMEIR 

GVDLDAPYYNYnSFTVPDIDNVTVLDYDARE 

QRVYWSDVRTQAIKRAFINGTGVETWSADL 

PNAHGLAVDWVSRNLFWTSYDTNKKQINVA 

RLDGSFKNAWQGLEQPHGLWHPLRGKLY 

WTDGDN1SMANMDGSNRTLLFSGQKGPVGL 

AIDFPESKLYWISSGNHTINRCNLDGSGLEVID 

AMRSQLGKATALAIMGDKLWWADQVSEKM 

GTCSKADGSGSWLRNSTTLVMHMKVYDES] 

QLDHKGTNPCSVNNGDCSQLCLPrSETTRSC 

MCTAGYSLRSGQQACEGVGSFLLYSVHEGIR 

GffLDPNDKSDALVPVSGTSLAVGlDFHAEND 

TIYWVDMGLSTISRAKRDQTWREDVVTNGIG 

RVEGIAVDWIAGNIYWTDQGFDVIEVARLNG 

SFRYWISQGLDKPRAITVHPEKGYLFWTEW 

GQYPRIERSRLDGTERWLVNVSISWPNGISV 

DYQDGKLYWCD ARTDKIEIUDLETGENRE V V 

LSSNNMDMFSVSVFEDFIYWSDRTHANGSIK 

RGSKDNATDSVPLRTGIGVQLKDDCVFNRDR 

QKGTNVCAVANGGCQQLCLYRGRGQRACA 

CAHGMLAEDGASCREYAGYLLYSERTILKSI 

HLSDERNLNAPVQPFEDPEHMKNVIALAFDY 

RAGTSPGIPNR^FSDIHI^GNIQQINUDGSRRIT 

IVENVGSVEGLAYHRGWDTLYWTSYTTSTIT 

RHTvDQTRPGAFERETVITMSGDDHPRAFV^ 

DECQNLMFWTNWNEQHPS1MRAALSGANVL 

TLIEKDIRTPNGLAIDHRAEKLYFSDATLDKIE 

RCEYDGSHRYVDLKSEPVHPFGLAVYGEHIF 

WTDWVRRAVQRANKHVGSNMKLLRVDIPQ 

QPMGHAVANDTNSCELSPCRINNGGCQDLCL 

LTHQGHVNCSCRGGRILQDDLTCRAVN SSCR 

AQDEFECANGECINFSLTCDGVPHCKDKSDE 

KPSYCNSRRCKKTFRQ C SNGRCV SNML WCN 

GADDCGDG$DEIPCNKTACGVGEFRCRDGTC 

1GNSSRCNQFVUCEDASDEMNCSATDCSSYF

RLGVKG VLFQPCERTSLCY APS WVCDGAND 

CGDYSDERDCPGVKRPRCPLNYFACPSGRCIP 

MSWTCDKEDDCEHGEDETHCNKFCSEAQFE 

CQNHRCISKQWLCDGSDDCGDGSDEAAHCE 

GKTCGFSSFSCPGTHVCVPERWLCDGDKDCA 

DGADESIAAGCLYNSTCDDREFMCQNRQCIP 

KHFVCDHDRDCADGSDESPECEYPTCGPSEF 

RC ANGRCLSSRQ WECDGENDCHDQ SDEAPK 

NPHCTSPEHKCNASSQFLCSSGRCVAEALLCN 

GQDDCGDSSDERGCHINECLSRKLSGCSQDC 

EDLKIGFKCRCRPGFRLKDDGRTCADVDECS 

TTFPCSQRCINTHGSYKCLCVEGYAPRGGDP 

HSCKAVTDEEPFLIFANRYYLRKLNLDGSNY 

TLLK(^Lr^AVALDFDYREQMIYWTDVTrQ 

GSMTRRMHLNG SNVQVLHRTGLSNPDGLAV 

DWVGGNLYWCDKGRDHEVSKLNGAYRTVL 

VSSGLREPRALWDVQNGYLYWTDWGDHSL 

IGRlGMDGSSRSVIVDTKrrWPNGLTLDYVTE 

RIYWADAREDYIEFASLDGSNRHWLSQDIPH 

1FALTLFEDYVYWTDWETKSINRAHKTTGTN 

KTLLISTLHRPMDLHVFHALRQPDVPNHPCK 

VNNGGCSNLCLLSPGGGHKCACPTNFYLGSD 

GRTCVSNCTASQFVCKNDKCIPFWWKCDTE 

DDCGDHSDEPPDCPEFKCRPGQFQCSTGICTN 

PAFICDGDN DCQDNSDEANCDIHVCLPSQFK 

CTNTNRCIPGIFRCNGQDNCGDGEDERDCPE 

VTCAPNQFQCSITKRCIPRVWVCDRDNDCVD 

GSDEPANCTQMTCGVDEFRCKDSGRCIPARW 

KCDGEDDCGDGSDEPKEECDERTCEPYQFRC 

KNNRCVPGRWQCDYDNDCGDNSDEESCTPR 

PCSESEFSCANGRCIAGRWKCDGUHDCADGS 

DEKDCTPRCDMDQFQCKSGHCIPLRWRCDA 

DADCMDGSDEEACGTGVRTCPLDEFQCNNT 

LCKPLAWKCDGEDDCGDNSDENPEECARFV 

CPPNRPFRCKNDRVCLWIGRQCDGTDNCGD 

GTDEEDCEPPTAHTTHCKDKKEFLCRNQRCL 

SSSLRCNMFDDCGDGSDEEDCSIDFKLTSCAT 

NASICGDEARCVRTEKAAYCACRSGFHTVPG 

QPGCQDINECLRFGTCSQLCNNTKGGHLCSC 

ARNFMKTHNTCKAEGSEYQVLYIADDNEIRS 

LFPGHPHSAYEQAFQGDESVRIDAMDVHVKA 

GRVYWTNWHTGTISYRSLPPAAPPTTSNRHR 

RQlDRGVTHLNISGLKMPRGIAroWVAGNVY 

WTDSGRDVIEVAQMKGENRKTLISGMEDEPH 

AIWDPLRGTMYWSDWGNHPKIETAAMDGT 

LRJBTLVQDNIQWPTGLAVDYHNERLYWADA 

KLSWGSIRLNGTDPIYAADSKRGLSHPFSrDV 

FEDY1YGVTYINNRVFKIHKFGHSPLVNLTGG 

LSHASDVVLYHQHKQPEVTNPCDRKKCEWL 

CLLSPSGPVCTCPNGKRLDNGTCVFVPSPTPP 

T)T\ A T)T) D/yrVKTr f^r»T7\T/l/iC'/^CT XT inn Am//in 

rl/ArKrul CWl^vjCrNuOoCrLNARRQPK.CRC 

QPRYTGDKCELDQCWEHCRNGGTCAASPSG 

MPTCRCPTGFTGPKCTQQVCAGYCANNSTCT 

VNQGNQPQCRCLPGFLGDRCQYRQCSGYCE 

NFGTCQMAADGSRQCRCTAYFEGSRCEVNK 

CSRCLEGACVVNKQSGDVTCNCTDGRVAPS 

CLTCVGHCSNGGSCTMNSKMMPECQCPPHM 

TGPRCEEHVFSQQQPGHIASILIPUXLLLLVL 

VAGWFWYKRRVQGAKGFQHQRMTNGAM 

NVEIGNPTYKMYEGGEPDDVGGLLDADFAL

DPDKFTNFTNPVYATLYMGGHGSRHSLASTD














EKRELLGRGPEDEIGDPLA 


374 


1724 


A 


3187 


191 


1815 


Q^LASAGKIPEESKALSLLAPArTMTSLMPG 

AGLLPIFTPNPLTTLGVSLSSLGAIPAAALDPN1 

ATLGEIPQPPIJ4GNVDPSKroEIRRTVYVGNL 

NSQTTTADQ1XEFFKQVGEVKFVRMAGDET 

QFTRFAFVEFADQNSVPRALAFNGVMFGDRP 

LKINHSNNAIVKPPEMTPQAAAKELEEVMKR 

VREAQSFISAAEEPGWLHSTSLCNDFLGCF* RR 

RM YRb* APCTICO rFHLCLIINWDL'Lr* AYTA 

K*FFPPRVWKEQ*KKRR\RSRSHTRSKSRSSSK 

SHSRRKRSQSKHRSRSHNRSRSRQKDRRRSK 

SPHKKRSKSRERRKSRSRSHSRDKRKDTREKI 

iUiiUiK V KiiKJJKJbKJiR^ 

KDRDKEREKDREKDKEKDREREREKEHEKD 

RDKEKEKEQDKEKEREKDRSKEIDEKRKKDK 

KSRTPPRSYNASRRSRSSSRERRRRRSRSSSR5 

HISERRERERSTSMRKSSNDRDGKEKLEKNST 
S 


375 


1725 


A 


3192 


415 


101 


AHSSHQTRADuQEFQWDIBRHPPL\SPNLALSG 
F\FPNLKKSLRGTHFSSVKK\TTLTWLNSQDP 
WF/FFYP* SPDLQIPSSFRNGLND WYHHSQKC 
PDLDGAYVKK 


376 


1726 


A 


3199 


931 


418 


GV* WCDLGSPQPPPPGFKQFCLGRSSS WD YR 
HVPPHPANFVFLLETGFLHAGQAGLNGDPPAS 
ASQSAGITGVSHTWPKNHLIFYACLVIRSKRJ 
K 




377


1727


A 


3201 


274 


1285 


KTGYTSRGSPLSPQSSIDSELSTSELEDDSlSM 

GYKLQDLTDVQIMARLQEESLRQDYASTSAS 

VSRHSSSVSLSSGKKGTCSDQEYDQYSLEDEE 

EFDHLPPPQPRLPRCSPFQRGIPHSQTFSSIREC 

RRSPSSQYFPSNNYQQQQYYSPQAQTPDQQP 

NRTNGDK/PPKKYA*PSPDAKYNCH* +QHVSSP 

VTVRNSQSFDSSLHGAGNGISRIQSCIPSPGQL 

QHRVHSVGHFPVSIRQPLKATAYVSPTVQGSS 

NMPLSNGLQLYSNTGIPTPNKAAASGIMGRS 

ALPRPSLAINGSNLPRSKIAQPVRSFLQPPKPL 

SSLSTLRDGNWRDGCY 


378 


1728 


A 


3202 


112 


1789 


VPGVTESRPSVLRGDHLFALLSSETHQEDPIT 
YKGFVHKV\ELDRVKLSFSMSLLSRFVGWG* 

pfkwy/™rqplrv\qhraleltgrwllw 

PMLmVAPRDVPLLPSDVKLKLYDRSLESNP 

EQLQAMRHIVTGTTRPAPYIIFGPPOTGKTVT
LVEAIKQVVKHLPKAFIILACAPSNSGADLLC
QRLRVHLPSSIYRLLAPSRDIRMVPEDDCPCCN

WDAKKGE Y VFPAKJCKL QE YRVLnTLITAGR 
LVSAQFPIDHFTHIF1DEAGHCMEPESLVAIAG 
LVfEVKETGDPGGOT VT AGDPROT GPVT P<5PT 

TQKHGLGYSLLERLLTYNSLYKKGPDGYDPQ 

FITKLLRNYRSHPTILDIPNQLYYEGELQACA 

DVVDRERFCRWAG\LPRQGFPIIFHGVMGKD 

EREGNSPSFFNPEEAATVTSYLKLLLAPSSKK 

GKARLSPRSVGV1SPYRKQVEKIRYCITKLDR 

ELRGLDDIKDLKVTCCSTVTPCLPCAPTCPLP 

ETSSSFHS SPRPRPTP AALNRARALPEPLTPGD 

SNLRVWDGIRKPACLTNTSCHS 


379 


1729 


A 


3206 


432 


130 


PKAAPSVXLWrTPFL*GSFKPTKGHTXCVXIK 
♦LSTREAXDSXPGRQIAXXRQGGKVETTTAL

XKQSNNKGTRASSYXEPDAXEQWKJFPHKKL 
QLPGXTHE 


380 


1730 


A 


3207 


187 


507 


OOTGHPHPARPPLSGVGGCQCSHSKPWTAGS 
PEQRDHPAPHKQEAGQGLPGPQAWGG+KGP 
AXLLPGPGGGPGFVASLEARAQASSGVTPNG 
GGRTYPYPTFSSGE 


381 


1731 


A 


3225 


1 


840 


GTRPGHLPAPSDGFCV/HL*SIPSWGSF*GESIV 

EMQLITSLGLQEFDIARNVLELIYAQTLVWIGI 

FFCPL1JPF1QMIMLFIMFYSKMSLMMNFQPPS 

KAWRASQMMTFHFLLFFPSFTGVLCTLAITI 

WIUJKPSADCGPFRGLPLFIHSIYSWIDTLSTRP 

GYLWVVWIYRNLIGSVHFFFILTLIVJLIITYLY 

WQITEGRKIMIRLLHEQIINEGKDKMFLIEKLI 

KLQDMEKKANPSSLVLERREVEQQGFLHLGE 

HDGSLDLRSRRSVQEGNPRA 


382 


1732 


A 


3238 


256 


38 


LLMIKVSSTCFSCHLHHHHHHHHRHHQGHNS 
LFFSLKSSSNSSTLPVYLSYNIILVFSKCLVFDF 
LFSNACL


383 


1733 


A 


3241 


1542 


343 


KGAPSFVRLYQYPNFAGPHAALANKSFFKAD 

KVTMLWNKKATAVLVTASTDVDKTGASYYG 

EQTLHYIATNGESAWQLPKNGPIYDVVWNS 

SSTEFCAVYGFMPAKATIFNLKCDPVFDFGTG 

PRNAAYYSPHGHILVLAGFGNLILQI*ADAMK 

VWNVKJOYKJUSKPVASDSTYFAWCPDGEHIL 

TATCAPRLRVNNGYKIWHYTGS1LHKYDVPS 

NAELWQVSWQPFLDGIFPAKTITYQAVPSEVP 

NEEPKVATAYRPPALRNKPITNSKLHEEEPPQ 

NMKPQSGNDKPLSKTALKNQRKHEAKKAAK 

QEARSDKSPDLAPTPAPQSTPRNTVSQSISGDP 

EIDKKIKNLKKKLKA1EQLKEQAATGKQLEK 

NQLEKIQKETALLQELEDLELGI 


384 


1734 


A 


3242 


3 


678 


IRSPAARSPGLETPTCLLFVIAAIAAVFVDSAIP 
RLTQHRPQDGSFPYTILDFPLYLPGQCAPPQP 
LSQCARRYHGEKLRRPTFGPRHRGAGTAKMS 
ASLVRATVRAVSKRKLQPTRAALTLTPSAVN 
KIKQLLKDKPEHVG VK VG VRTRG CN GLS YTL 
EYTKTKGD SDEEVIQDG VRVFIEKKAQLTLL 
GTEMDYVEDBCLSSEFVFNNPNIKGTCGCGES 
FNI 


385 


1735 


A 


3243 


3190 


664 


VAMGTPRAQHPPPPQLLFLILLSCPWIQGLPL 

KEEEILPEPGSETPTVASEALAELLHGALLRR 

GPEMGYLPGPPLGPEGGEEETTTTiriTTTVTT 

TVTSPVLCNNN1SEGEGYVESPDJLGSPVSRTL 

GLLDCTYS1HVYPGYGIEIQVQTLNLSQEEELL 

VLAGGGSPGLAPELLANSSMLGEGQVLRSPT 

NRLLLHFQSPRVPRGGGFRIHYQAYLLSCGFP 

PRPAHGDVSVTDLHPGGTATFHCDSGYQLQG 

EETLICLNGTRPSWNGETPSCMASCGGTIHNA 

TLGRIVSPEPGGAVGPNLTCRWVIEAAEGRRL 

HLHF'ERVSLDEDNDRLMVRSGGSPLSPVIYDS 

DMDDVPERGLISDAQSLYVEIXSETPANPLLL 

SLRFEAFEEDRCFAPFLAHGNVTTTDPEYRPG 

ALATF SCLPGY ALEPPGPPNAIEC VDPTEPH W 

NDTEPACKAMCGGELSEPAGWLSPDWPQS 

YSPGQDCVWGVHVQEEKRILLQVEILNVREG 

DMLTLFDGDGPSARVLAQLRGPQPRRRLLSS 

GPDLTLQFQAPPGPPNPGLGQGFVLHFKEVPR 

NDTCPELPPPEWGWRTASHGDLIRGTVLTYQ 

CEPGYELLGSDILTCQWDLSWSAAPPACQKI 







WO 01/57188 PCT/US01/03800 



SEQ ID NO: of nucleotide sequence


SEQ ID NO: of peptide sequence


Method


SEQ ID NO: in USSN 09/496914


Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence


Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence


Amino acid sequence (A=Alanine, C=Cysteine,
D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion)


MTCADPGEIANGHRTASDAGFPVGSHVOYRC
LPGYSLEGAAMLTCYSRDTGTPKWSDRVPKC 
ALKYEPCLNPGVPENGYQTLYKHHYQAGBSL 
RFFCYEGFELIGEVTITCVPGHPSQWTSQPPLC 
KVTQTTDPSRQLEGGNLALAILLPLGLVIVLG 
SGVYIYYTKLQGKSLFGFSGSHSYSPITVESDF 
SNPLYEAGDTREYEVSI 


386 


1736 


A 


3250 


5725 


3984 


GTSTVTMATKKHFSIILNLLGMLIJK^ 

RKLLNfTWALEVAVVMKKSETYAPLFCLPSF 

HKFCKGLLADTLVEDVNICLQACSSLHALSSS 

LPDDLLQRCVDVCRVQLVHRGTCIRQAFGKL 

LK5IPLGVFLSNNNHTEIQEISLALRSHMSKAP 

SNTFHPQDFSD/VISFILYGNSHRTGKDNWLE 

RLFYSCQRLDKRDQSTIPRNLLKTDAVLWQW 

AIWEAAQFTVLSKLRTPLGRAQDTFQTIEGIIR 

SLAGHTLNPDQDVSQWTTADNDEGHGNNQL 

RLVLLLQYLENLEKLMYNAYEGCANALTSPP 

KVIRTFL YTNRQTCQD WLTRIRLSIMR VGLL A 

CKJPAVTVimGFDLLTEMKTTSLSQGNELEVSI 

MMVVEALCELHCPEAIQGiAWSSSIVGKHL 

LVVINSVAQQAEGRFEKASVEYQEHLCAMTG 

VDCCISSFDKSVLTLASAGCKSASLKHCLNGE 

aKJ^VLiisr I Uborb V1N YLONKACliLYISTA 

DWAAVQEWQNAIHDLKKSTSSTSLNLKADF 

NYIKSLSSFESGKFVECTEQLELLPGENINLLA 

GGSKEKDDMKKLLRNM 


387


1737


A




sen 


/o 


MUlr L i JNCK I Vy 1 t,r NMQnIMA\oKJCLaKF 
LKYVHNL* AENYKTLMK* INEDLNKQRD VP Y 
S*TARLNKMSIPTKTIFRFKArVlKJPATYFIET 
NMQ 


388 


1738


A 






4/0 


WLulA^ V I ALrrANr Vrr VbMKb 1 1LAQ 1 U 
FELLDS SDLPAS ASKS AGITCMSHHARTLSLK 
*WPFCLSATQEKFC*'PASEGVAW 


389 


1739 


A 


3269 


1 


332 


tDGYHTPIYMLNRIIRLP AAL* IISDQTGHALTI 
LTRLETQMINADYQNKLTLDYLLTTDREVYE 
m^ju i n i v^jumni n x ulikj y Vc» JSJ-»Ari v 
PVQV*HGFDPEAMFR 


390 


1740 


A 


3270 


2 


372 


GRCHDQNKGKS\DGPDAQAEACGGESTYQEL 
LVNQWIGQPLACRRLTRKIYEGDCKAVKFNH 
^PRnWlfVTTKTJVMKGFlffYTMVT A firm GlCiV 

YCLLPCMC*DRKLTYAHIPSTTDLGAGAGY 


391 


1741 


A 


3273 


1 


187 


FFQEMLD1MKAJSDMMGKCTYPVLKEDAPRQ 
HVETFFQ\EELTRSQEGMKLGENFLMFAMPP 
DDSKESKGK*FFQEMLDIMKAISDMMGKCTY 
r v Lir\j^uj\sr ivv^n vni. rr v uhn i^KoxvVjriJfc v jvts. 
KFPDVCHAPR 


392 


1742 


A 


3281 


901 


521 


FFFGDGVSPCRQAGV* WHDLDSLQNLPPGFK 
RFSYLSLPSSW\DYRHVLPRQANFCIF/M*RRG 
FTMLARMVSIS*PRDLPALASQSAGITGVSHH 
APPQMDFTFALLCFALKGCLPRQKEGGTLNLI 


393 


1743 


A 


3283 


385


3 


RNRSWPEFVLLGLSAGPQTQTLLFVLFWIC 
LLTVMGNLLLLVVINADSCLHTPMYFFLGQL 
SFLDLCHSSVTAPKLLENLLSEKKTISVEGCM 
A*VFFVFATGGTESSLLAVMAYDRYVAIRTR 
G 


394 


1744 


A 


3284 


575 


1054 


CTKCKADCDTCFNKNFCTKCKSGFYLHLGKC 
LDNCPEGLEANNHTMECVSIVHCEVSEWNP 
WSPCTKKGKTCGFKRGTETRVREIIQHPSAKG 
NLCPPTNETRKCTVQRKKCQKGERGKKGRE

RKRKKFNKGESKEA1PDSKSLESSKEIPEQREN 
KQQQ 


395 


1745 


A 


3286 


1 


340 


RVLYVPSMGFC1LVAHGWQKISTKSVFKKLS 
WICLSMVlLTHSLKTFliKNWDWESEYTLFMS 
ALKVNKNNAKLWNNVGHALENEKNFERAL 
KYFLQATHVQPDDIGAHMNVGR 


396 


1746 


A 


3293 


1 


172 


GFRAWMTVKTEAAKGTLTYSRMRGMVAIL 
IAFMKQRRMGLNDFIQKIANNSYACKQ 


397 


1747 


A 


3295 


12 


401 


AEPACGASSCTPPSLRSSSSQSVGPLRPGRPL 

WSEACAFL*AAAPQGPASPCCGLPSGFPRVW 

AQCCPPGGALRFPEGLGSVLSPRRCPQVSRGS 

GLSAVPQEVPSGFLGPGLRACPQEAPSRFLRA 

GUT 




398


1748


A




1 010 


X/OO 


rrrxo r> \vrv\Tim? V fJPVT? VTVT A fTKrcri^WnPMrEQ 
JVl^ivxv W V^INl^lvfv-LTt tviv x 1 v iJ\KJN&K£&tri\^rSVHrj> 

MLRKLPKVTCRDVLPEIRAICIEEIGCWMQSY 

STSFLTDSYLKYIGWTLHDKHREVRVKCVKA 

LKGLYGNRDLTARLELFTGRFKDWMVSM1V 

DREYSVAVEAVRLLILILKNMEGVLMDVDCE 

SVYPIV+ASN+GLASAVGEFLYWKLFYPECEI 

RTMGGREQRQSPGAQRTFFQLLLSFFVESKSH 

SVTQAGVQWQFSAHRDLCLPGSSNSHVSASR 

VAGlAGAHRHTWLrYVFFSWRQGFAVLAGL 

VSNS 


399 


1749 


A 


3301 


536 


2391 


LRS YGCKAPSR I SHLHKNFLFLLLPSLLMG YSE 

SPPPITDSWAPF1SLTHHVLSQSQSPLSSNCWI 

CLSTHTQ*FTALPADLLTWTQSNVSLHISYLAI 

PFLADSFLKPV/L*PGNSAKHLSFKLSSLSMVS 

GRAVALLHLIASGLTSIQTNTASSKPPIWGY\L 

STQTSF1BPPPLCLSRTYPNPAHATMVGQVPQ 

SLCGLIFnyRTPCRPSILHPNYKIISTSAWQKV 

LCFSGSPTIHTSLHLTTGSSFLSFHPIPGFPAAN 

SALYVSSLKGPPGKNVTTPSPVTGT* QPPHRGS 

N/RLTVDKDNFFLSPKPNSLHQLPSQ\TPYQAL 

PSLFFLCDrN*YLCLPANWSGTCTLVFQAPTl 

NILPPNQTILISVEASISSSPIRNKWALHLITLLT 

GLGITAALGTGIAGITTSrrSYQTLFTTLSNTVE 

DMHTSITSLQRQLDFLVGVILQNWRVLDLLT 

TFKGGTCTVT.OEECCFCVNESGTVHIAVRRLH 

DRAAEL*HQVADSWWQGSSLLRWIPWVAPF 

LGPUFLFLLLMIGPCIFNLVSRPISQRLNCFIQ 

ASMQKHIDNIFHLCHV*YQSLRGNHSEAPEPR 

P 


400 


1750 


A 


3303 


2 


453 


THWRHSSGVPGSTTARRRRRELEIATSDNQE 
YYNRI/XJEVTNRERNDQKMLADIJ3DLNRTK 
KYLEERLIELLRDKDALWOKSDALEFOOKLS 
AEERWLGDTEA>THCLDCKREFSWMVRRHHC 
R1CGRIFCYY CCNNYVLSKHGGKKERCC 


401 


1751 


A 


3304 


1


626 


MAPQHS SLDDKVPQQ ASTVCFEFQDILQHSQ 

CTEHKDSLWGPGARSQPFGAHNTRLSPDSCP 

EKIVLRALKDSRAGMPEQDKDPGVQENPDD 

QRRVPQGTGDAPSAFRPLWDNGGLSPFVSRP 

GPLERDLHAQRSEVTYNQRSQSSWMSSFPKR 

NAFVSPYSSMGQ AQP/GLPKTNPIGES CCWEG 

LSL STQILG* QKPSKYIPSLCKR 


402 


1752 


A 


3305 


1678 


172 


MELPSGPGPERLFDSHRLPGDCFLLLVLLLYA 
PYGFCLLVLRLFLGIHVFLVSCALPDSVLRRF 
WRTMCAVLGLVARQEDSQLRDHSVRVLISN 
HVTPFDHNIVNLLTTCSTVSESEAESATGRFP

GAQLKAPLSPLAFRMEDTEALPLTPILYPTCO

FFFFJFLNIFLLAFSSPGSQPLLNSPPSFVCWSR 

GFMEMNGRGELVESLKRFCASTRLPPTPLLLF 

PEEEATNGREGLLRFSSWPFSIQDWQPLTLQ 

VQRTLVSVWSDASWVSEmWSI^VPFTVY 

QVRWLRPVHRQLGEANEEFALRVQQVLVAKE 

LG\QTGTRLTPA\DBCAEHMKRQRHPR\LRPQS 

AQSSFPPSPWVLSS/SDVQTGQTLGFREFKESF 

CPHYAIGVFIPERPWPKTGCCKTLTIHLILL*G 

GPVSFSCPE\DIHPRGT*VPTQQASGLPSFPSYG 

PARGGVL*HPSAQQPLTFA\KSS\WARAGRAL 

QERKQ\ALYEYARRRFTERRAPGGLD 


403 


1753 


A 


3307 


44 


447 


DPSPSLLAVALGLRAGERTRSGPGSSSPSGGIS 

GGASAGLASSPECACGRSHFTCAVSALGECT 

CIPAQWQCDGDNDCGDHSDEDGCILPTCSPL 

DFHCDNGKCIRRSWVCDSDNDCEDDSDEQD 

CPPRECEED 


404 


1754 


A 


3311 


409 


1 


PRHGWGRRVLGRDRPRLQKVKKSVKAlYIPG 
QDIIVQNEEIYARVLDKFGSNFLSRJDNADLGT 
AFVKFSTLTK*LSALLKNLLQGLSRNVIFTLDS 
LLKGDLKGVKGDLKKPFDKAWKDYETKFAK 
IEKEKREREWR 


405 


1755 


A 


3322 


12 


458 


AAVPVENPWDDPRVRPRVRIFTWEDCIAGQA 
KVLCNDSYGVTIDWSPKGAFIRLTSQSVGNG 
HPASKENDQMVDTDCNTTKVPIIWTYGDMVE 
PRPQMIRPAVGAKHKELWKILMALKKIK\IWE 
GKYTKPSQYNPNYMLELAHNDSVW 


406


1756 


A 


3324 


1 


426 


LSMLST1STEHRLSVLWPIWVCCHCPTHLSAV 
MCVLLWALSLLQSILEWMFCSFLFSDVDSDN 
WCQILDFLTAVWLIFLI\LVLCGFTLVLLVRnC 
GSQKMPLTRLYVTILLTGLVFLFCSLPLSIQ+F 
LLYWIEKDLDDL 


407 


1757 


A 


3328 


213 


1841 


SGDLSPAELMMLUGDVIKQLIEAHEQGKDID 

LNKVKTKTAAKYGLSAQPRLVDDAAVPPQY 

RKVLMPKLKAKPIRTASGIAVYAVMCKPHRC 

PHISFTGNICVYCPGGPDSDFEYSTQSYTGYEP 

TSMRAIRARYDPFLQTRHRIEQLKQLGHSTO 

K VEFIVMG GTFMALPEE YRD YF1RNLHD ALS 

GHTSNNIYEAVKYSERSLTKCIGITIBTRPDYC 

MKRHLSDMLTYGCTRLEIGVQSVYEDVARD 

Tr^GHTVKAVCESFHLAKDSGFKWAHMMP 

DLPNVGLERDIEQFTEFFENPAFRPDGLKLYP 

TLVIRGTGLYELWKSGRYKSYSPSDLVELVA 

RILALVPPWTRVYRVQRDIPMPLVSSGVEHG 

NLRELALARMKDLGIQCRDVRTREVGIQEIH 

HKVRPYQVELVRRDYVANGGWETFLSYEDP 

DQDILIGLLRLRKCSEEITRFELGGGVSIVREL 

HVYGSWPVSSRDPTKFQHQGFGMLLMEEA 

ERIAREEHGSGKIAVISGVGTRNYYRKIGYRL 

QGPYMVKMLK 


408 


1758 


A 


3335 


3 


467 


AIASPRAAGIRHELTSTMAAGKNKRlTKGGK 
KGAKKKAV/DNIINIGKTLVTRTQRTKIASDG 
LKGRVFEESLADLQND\TDGYLLRVI*VAFTT 
ERTNQI/REVFNKLIPDSIGKDffiKACQSIYPLH 
DDFARKVKMLKKPKFELRKLMELHGEGSS 


409 


1759 


A 


3338 


7 


1252 


PRWRNSARDEILLSFPQNYYIQWLNGSLIHGL 
WNLASLFSNLCLFVLMPFAFFFLESEGFAGLK 
KGIRARILETLGMLLLLALLILGIVWVASALID 
NDAASMESLYDLWEFYLPYLYSCISLMGCLL
LIXCTTVGLNSRMFTVMGQLLVKPTELEDLDE
QIYinXEEEALQRPTKWAVFIRW/KYNIMELE 
QELENVKTUCTKLERRKKASAWERNLVYPA 
VMVLIXIETSISVLLVAOOLCLLVDETAMPK 
GTRGPGIGNASLSTFGFVGAALEIILIFYLMVS 
SVVGFr-SLRFFGNFTPKKDDTTMTKnGNCVS 
1LVLSSALPVMSRTLGITRFDLLGDFGRFNWL 
Gr^lVLSYNLLFAIVTTLCLVRKFTSAVREE 
LFKALGLHKLHLPNTSRDSETAKPSVNGHQK 
AL 


410 


1760 


A 


3339 


127 


1433 


GSHRFSLASPLDPEVGPYCDTPTMRTLFNLL 

WLALACSPVHTTLSKSDAKiCAASKTLLEiCSQ 

FSDKPVQDRGL VV T DLKAESV VLEHRSYCSA 

KARDRHFAGD VLG YVTP WN SHGYDVTKVFG 

SKPTQISPVWLQLKRRGREMFEVTGLHDVDQ 

GWMRAVRKHAKGL\P* CLGSCLRTGLTMISG/ 

YVLDSEDEIEELSKTWQVAKNQHFDGFWE 

V WNQT .1 ,SQKRVGLIHMLTHLAEALHQARLL 

AIXVffPAnPGTDQLGMFTHKEFEQLAPVLD 

GFSLMTYDYSTAHQPGPNAPLSWVRACVQV 

LDPKSKWRSKILLGLNFYGMDYATSKDAREP 

WGARYIQTLKDHRPRMVWDSQVSEHFFEY 

KKSRSGRHVVFYPTLKSLQVRLELARELGVG 

VSI WELGQGLD YFYDLL* VGIAASA VD VFFSK 

PWSE 


411 


1761 


A 


3342 


74 


2701 


VATRKLAKGFTQFAKMTEGTKKTSKKFKFFK 

FKGFGSFSNLPRSFTLRRSSASISRQSHLEPDTF 

EATQDDMVTVPKSPPAYARSSDMYSHMGTM 

PRPSIKKAQNSQAARQAQEAGPKFNLVPGGV 

PDPPGLEAAKEVMVKATGPLEDTPAMEPNPS 

AVEVDPIRKPEVPTGDVEEERPPRDVHSERAA 

GEPEAGSDYVKFSKEKYILDSSPEKLHKELEE 

ELKLSSTDLRSHAWYHGRIPREVSETLVQRN 

GDFL1RD SLTSLGDY VLTCRWRNQALHFKIN 

KVWKAGESYTHIQYLFEQESFDHVPALVRY 

HVGSRKAVSEQSGAITYCP WRTFPLRYLEAS 

YGLGQGSSKPASPVSPSGPKGSHMKRRSVTM 

TDGLTADKVTRSDGCPTSTSLPRPRDSIRSCA 

LSMDQIPDLHSPMSP1SESPSSPAYSTVTRVHA 

APAAPSATALPASPVARRSSEPQLCPGSAPKT 

HGESDKGFHTSPSHTLGKASPSPSLSSYSDPDS 

GHYCQLQPPVRGSREWAATETSSQQARSYGE 

RLKELSENGAPEGDWGKTFTVPIVEVTSSFNP 

ATFQSLLIPRDNRPLEVGLLRKVKELLAEVDA 

RTLARHVTKVDCLVARILGVTKEMQTLMGV 

RWGMELLTLPHG\RKLRLDLLERFHTMSIML 

AVDILGCTGSAEERAALLHKT1QLAAELRGT 

MGNMFSFAAVMGALDMAQISRLEQTWVTLR 

QRHTEGAIL YEKKLKPELKSLNEGKEGPPL SN 

TTFPHVLPLrTLLECDSAPPEGPEPWGSTEHGV 

EVVLAHLEAARTVAHHGGLYHTNAEVKLQG 

FQARPELLEVFSTEFQMRLLWG SQGAS SSQA 

RRYEKFDKVLTALSHKLEPAVRSSEL 


412 


1762 


A 


3347 


1 


898 


IDRAAECRTKPLPMAVSIRGNADSIVACLVLM
\^YLIKKRLVACAAVFYGFAVHMKIYPETYI 
LPITLHLLPDRDNDK.SLRQFR YTFQ ACL* ELL 
KRLCNRTALMFVAVAGLTFFALSFGFYYEYG 
WEFIJEHTYFYHLTRRDIRHNFSPYFYMLYLT 
AESKWSFSLGIAAFLPQLILLSAVSFAYYRDL 
VFCWFTJiTSlFVTTNKVCTSQYFLWYLCLLPL
vmrlvrmpwkrA Vyi \ijn .vtexhh AJUWt_Ab
A YVLEFQ GKNTFLH WLAGLFFLLINCSILIQII 
SHYKEEPLTERIKYD 


413 


1763 


A 


3361 


3 


474 


PIPVRWNSLEGRLLRGYEQHANDGKDYISRN 
♦DLRSWTAADMAAQITKRKWEAEEFAEQIKA 
YLEGTCVER/LRTHLENGKETLQLTEQSSQPTI 
PIVGIVAGLVLLGAWTGAVVSAVMCRKKNS 
GHFLPTDRVSYSEAASSDHAQGSDVSLTACK 
V 


414 


1764 


A 


3363 


1488 


453 


HQILELKKKILKTYNPDYDEDLVQEASSEDVL

GVHMVDKDTERDIEMKRQLRRLRELHLYST

WKKYQEAMKTSLGVPQRERDEGSLGKPLCP

PtlLSETLPGSVKKRVCFPSEDHLEEFlAEHLP 
EASNQSLLTVAHADAGTQTNODLEDLEEHOP 
GQTVSEEATEVHMMEGDPDTLAELLIRDVLQ 
ELSSYNGEEE\DPEEVKTSLGVPQRGDLEDLE 
EHWGQWSEEATGVHMMQVDPATLAKSDL 
EDLEEHVPEQTVSEEATGVHMMQVDPATLA 
KQLEDSTITGSHQQMSASPSSAPAEEATEKTK 
VbEEVKTRKPKKKTRKPSKKSRWNVLKCWD 
IFNIF 


415 


1765 


A 


3369 


431 


315 


IPWSWVGRLSVRKMSELF*LTYNYNAILNKTP 
PSFSPSL 


416 


1766 


A 


3373 


42 


651 


RQEKMGLGE1GASGVLRSMLKERKKQNMKG 
NGNVTLTPLLPAVQCGCHLQPAGRSPLPSSHS 
APGLCSPLHPLQPQQEASTCPSGTLQGREKAA 
PGQGRPLCSLWAGGAGA\PGERGAEGRGPSD 
QAPDPKSGPWLFPPGLGAPAEVRLHNVPHNL 
RRPPLP* ARGK*PPNSGCPWSEGRAKQPLS CG 
PKPQCSLPSQVPGDTH 


417 


1767 


A 


3382 


2 


2061 


EAQDPRACGPDAGGRFAARDAPGNSLRPPPS 

SPP/GWPGQLRLLPRVPGSELRCGKPERGRLP 

ASPPGKIRGWPPGISKRPGLGGRSFPPGFAPRT 

WRPEARGPSVQSLPPIFSPQSAQTTAR*RPGAP 

KNAGRCGGA\RGPRLSLGPPPGPPPAPALPAR 

ASAGAGAAAAALAVGGVRGAGGARGTGGY 

GHCSGR/PTGRTGPGPQGPGPPMPARPR*A$\S 

TRGSRRGPGSRPARAAAAPRAGDHGRRPVRV 

HLRQHTA V* EPRL GDAT APPGG AAGPG AP AP 

R\GPGWDCALLPSPGPRSPRAVGCAEPEIWDP 

SPRRGTSPVPSVRSLRSEPANPRLGLPALLNSY 

PLKGPGLPFPwGPRTQTGHVUTVQPSGSCIEH 

SKSLD/RGPWGAPPWGPSSSGLCSPKLATAGP 

PQSWGLCQIGRRRGLGGPGLKRGET/GLL*GC 

SMDHANRTKGPGVPTSNRCFSHIPG\GDGCSD 

HSSCEGPffDLHAGREMPAAPGLSELERVRFT 

VGCGGLASGISSASVSGLSPNRAGGPGQGDW 

EMYPVSWQTQESGGQG/SPKTGR*VGMLQA 

GAGSLQGGTGDGVWGLWEDGP/RG*DSPLPS 

GTGTEP*TPTTSIPFFPQPSGVYPSRATLLPMPS 

Y*ALGPSANKSEKPLLSFLYRGLCCRISLQLA 

KGIGQLSEIPLLNVETAFWSMWVTYFRK 


418 


1768 


A 


3398 


304 


2121 


EEEEEEEDEDDDDNNEEEEFECYPPGMKYQV 
RYGRGKNQKMYEASKDSDVEGGEVLYLVH 
YCGWNVRYDEWiXM)KrVRPADKNVPKIKH 
RXKIKNKLDKEKDKDEKYSPKNCKPPALGPN 
PPFQTNPISWKWYPKLDLTDAKNSDTAHIKSI 
EITSILNGLQASESSAEDSEQEDERGAQDMDN 
NGKEESKIDHLTNNRNDLISKEEQNSSSLLEE 
SEQ ID NO: of nucleotide sequence

SEQ ID NO: of peptide sequence

Method

SEQ ID NO: in USSN 09/496914

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

Amino acid sequence (A=Alanine, C=Cysteine,
D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion)

^VfiADLVISKPVSKSPERLRKDIEVLSEDTD 
YEEDEVTKKRKDVKKDTTOKS SKPQIKRGKR 
RYCNTEECLKTGSPGKKEEKAKNKESLCMEN 

SLRTTGFYSGFSEVAEKRJKLLNNSDERLQNS 
RAKDRKDVWSSIQGQWPKKTLKELFSDSDTE 
AAA <?PPHP APEFGVAEESLOTVAEBESCSPSV 
ELEKPPPVNVDSKPIEEKTVEVNDRKAEFPSS 
GSNFS A* IPLPYLHLNRLHQSL* QKGSRQQSS 
VTVSEPLAPNQEE VRSIKSETDSTIEVD S V AGE 

KSLYRSEKSERCSGRRKFIKKAEKKP* SNSGK 
QQKEGK 


419 


1769 


A 


3399 


206 


463 


QRECLSIHIGQAGIQIGDACWELYCLEHGIQP 
NGVVLDTQQDQLENAKMEHTNASFDTFFCE 
TRAGKHVPRALFVDLEPTVIDGIR 


420 


1770 


A 


3408 


1010 


685 


RRLSFFF* 1WSSVLVTQARVQWRDLGSPQPLP 
PGFKRFSCLSLPSSWDYRHPSPRPVNF/HVFLV 
VMGFHHVGQAGLELLTSGDLPALASQSARIT 
GVNHCAQPRGHFH 


421 


1771 


A 


3409 


355 


1326 


ADSNLDESCWQELGLGPWGGDWRVEQVGAS 

LYLVSPLENEPKEMLTLSEYHERVRSQGQQL 
QQLQAELDKLHKEVSTVRAANSERVAKLVF 
QRLNEDFVRKPDYALSSVGASIDLQKTSHDY 
ADRNTAYFWNRFSFWNYARPPTVILEPHVFP 

PPSVEHTGGANSAPRDFAVFFLLSFFTHQGLQ 
VYDETEVSLGKFTFDVEKSEIQTFHLQNDPPA 
AFPKVKIQILSNWGHPRFTCLYRVRAHGVRT 
SEGAEGSAQGPH 


422 


1772 


A 


3412


2 


421 


EFDAQPSIGALWFKRP*ATTGSDPGPKRGMN 

YLVSCSMRSPESGKGEPGTARDYTPMGRPPP 

PVPWSPGPLPGSLAIAPHSPEPHPWEOOPPRG 

QARSPPGGWLGSAT/RVRRPHNHP/RGH/HSP 

VDTAGAPASPGPDVCE 


423 


1773 


A 


3420 


91 


706 


DAQRAIYSSVGPAVSLRQRQQDGAVKESGR/ 
RGGVRSFSRAAAAN1APIKYGDAIPAVEVFEG 
EPGNKVNLAELFKGKKGVLFGVPGAF1PGCS 
KTHLPGFVEQAEALKAKGVQVVACLSVNDA 
FVTGEWGRAHKAEGKVRLLADPTGAFGKET 
DLLLDDSLVSIFGNRRLKRFSMVVQDGIVKA 
LNVEPDGTGLTCSLAPNIISQL 


424 


1774 


A 


3421 


4 


7688 


RQVTRVGTRVLGSTT AA.VFL S VEDDNDN AP Q 

FSEKRYWQVREDVTPGAPVLRVTASDRDKG 

SNAWHYSIMSGNARGQFYLDAQTGALDVV 

SPLDYETTKEYTLRVRAQDGGRPPLSNVSGL 

VTVQVLDINDNAPIFVSTPFQATVLESVPLGY 

LVLHVQAIDADAGDNARLEYRLAGVGHDFP 

FITNNGTG Wl S V AAELDREE VDFYSFG VEAR 

DHGTPALTASASVSVTAJLDVNDNNPTFTQPE 

YTVRLNEDAAVGTSWTVSAVDRDAHSV1TY 

QITSGNTRNRFS1TSQSGGGLVSLALPLDYKLE 

RQYVLAWASDGTRQDTAQIVVNVTOANTH 

RPVFQSSHYTVNVNEDRPAGTTVVLISATDE 

DTGENARTTYFMEDSIPQFRIDADTGAVTTQA 

ELDYEDQVSYTLATTARDNGIPQKSDTTYLEI 

LVNDVNDNAPQFLRDSYQGSVYEDVPPFTSV 

LQISATDRDSGLNGRVFYTFQOGDDGDGDFI 

VESTSGIWTtkRLDRENVAOYVLRAYAVDKl 

GMPPARTPMEVTVTVLDVNDNPPVFEQDEFD 

VFVEENSPIGLAVARVTATDPDEGTNAQIMY 

QIVEGNffEVFQLDIFSGELTALVDLDYEDRPE 

YVLVIQATSAPLVSRATVHVRLLDRNDNPPV 

LGNFEtLFNNYVTNRSSSFPGGAIGRVPAHDP 

DISDSLTYSFERGNELSLVLLNASTGELKLSR 

ALDNNRPLEAIMSVLVSDGVHSVTAQCALRV 

TUTDEMLIKSITLRLEDMSPERFLSPLLGLFIQ 

AVAATLATPPDHVVVFNVQRDTDAPGGHILN 

VSLSVGQPPGPGGGPPFLPSEDLQERLYLNRS 

LLTAISAQRVLPFDDN1CLREPCENYMRCVSV 

LRFDSSAPFIASS SVLFRPIHPVGGLRCRCPPGF 

TGDYCETEVDLCYSRPCGPHGRCRSREGGYT 

CLCRDGYTGEHCEVSARSGRCTPGVCKNGGT 

CVNLLVGGFKCDCPSGDFEKPYCQVTTRSFP 

AHSFITFRGLRQRFHFTLALSFATKERDGLLL 

YNGRFNEKHDFVALEVIQEQVQLTFSAGEST 

TTVSPFVPGGVSDGQWHTVQLKYYNKPLLG 

QTGLPQGPSEQKVAVVTVDGCDTGVALRFGS 

VLGNYSCAA\QGTQGGSKKSLDLTGPLLLGG 

VPDLPESFPVRMRQFVGCMRNLQYDSRHIDM 

ADFIANNGTVPGCPAKKNVCDSKTCHNGGTC 

VNQWDAFSCECPLGFGGKSCAQEMANPQHF 

LGS SLVAWHGLSLPISQPWYLSLMFRTRQAD 

GVLLQATrRGRSTITLQLREGHVMLSVEGTGL 

QASSLRLEPGRANDGDWHHAQLALGAIGGP 

GHAILSFDYGQQRAEGNLGPRLHGLHLSNTTV 

GGIPGPAGGVARGFRGCLQGVRVSDTPEGVN 

SLDPSHGESINVEQGCSLPDPCDSNPCPANSY 

CSNDWDSYSCSCDPGYYGDNCTNVCDLNPC 

EHQSVCTRKPSAPHGYTCECPPNYLGPYCET 

RIDQPCPRGWWGHPTCGPCNCDVSKGFDPDC 

NKTSGECHCKENHYRPPGSPTCLLCDCYPTG 

SLSRVCDPEDGQCPCKPGV1GRQCDRCDNPF 

AEVTTNG CE VNYDS CPRAIE AGIW WPRTRFG 

LPAAAPCPKGSFGTAVRHGDEHRGWLPPNLF 

NCTSITFSELKGFAERLQRNESGLDSGRSQQL 

ALLLRNATQHTAGWGSDVKVAYQLATRLL 

AHESTQRGFGLSATQDVHFTENLLRVGSALL 

DTANKRHWEUQQTEGGTAWLLQHYEAYAS 

ALAQNMRHTYLSPFTAT-PNIVISVVRLDKGN 

FAGAKLPRYEALRGEQPPDLETTVILPESVFR 

ETPPVVRPAGPGEAQEPEELARRQRRHPELSQ 

GEAVASVIIYRTLAGLLPHNYDPDKRSLRVPK 

RPIINTPWSISVHDDEELLPRALDKPVTVQFR 

LLETEERTKPICVFWNHSILVSGTGGWSARGC 

EVWRNESHVSCQCMTMTSFAVLMDVSRRE 

NGEILPLKTLTYVALGVTLAALLLTFFFLTLL 

RILRSNQHGIRRNLTAALGLAQLVFLLGINQA 

DLPFACTVIAILLHFLYLCTFSWALLEALHLY 

RALTEVRDVNTGPMRFYYMLGWGVPAFTTG 

LAVGLDPEGYGNPDFCWLSrYDTLIWSFAGP 

VAFAVSMSVFLYILAARASCAAQRQGFEKKG 

PVSGLQPSFAVLLLLSATWLLALLSVNSDTLL 

FHYLFATCNCIQGPFIFLSYVVLSKEVRKALK 

LACSRKPSPDPALTTKSTLTSSYNCPSPYADG 

RLYQP\YGDSAGSLHSTSRSGKSQPSYIPFLLR 

EESALNPGVQGPPGLGGIPGR/LCFLGRFKDQQ 

H\DS*TRDFDSDLSLEDDQSGSYASTHSSDSEE 

EEEEEEEEAAFPGEQGWDSLLGPGAERLPLHS 
TPKDGGPGPGKAPWPGDFGTTAKESSGNGAP 
EERLRENGDALSREGSLGPLPGSSAQPHKGIL 
KKKCLPTISEKSSLLRLPLEQCTGSSRGSSASE 
GSRGGPPSRPPPRQSLQEQLNGVMPIAMSIKA 
GTVDEDSSGSEFLFFNFLH 


425 


1775 


A 


3429 


155 


1417 


GEPAVQSCDCGCTQRSCPWIXVAPGLLSSSSS 
RAASVREAEDAPLQPASIHPVSQGSRGPEGSL 
GSAECLPGDPLGARRATRAHSPVPGPPPSLPA 
AGTAVKRGLQPG*GA/GATSTPGTGAATGGL 
CGPAWAAPSAVGPCCCCPSISTTPSQMRSARP 
SLGCLPSWAS\PGTEHPPGPQGPGPS*DLCSV* 
KREFQRGPWAGMVTLHRISAADPARAPGPDS 
NLQSALQQPATGCSEPAAVYSPPIGLWGA**P 
EYG*PQHSLPG*TAPADR*P\AGIKDRVYSNSI 
YELLENGQRAGTCVLEYATPLQTLFAMSQYS 
QAGFSREDRLEQAKLFCRTLEDILADAPESQN 
NCRLIAYQEPADDSSFSLSQEVLRHLRQEEKE 
EVTVGSLKTSAVPSTSTMSQEPELLISGMEBCP 
LPLRTDFS 


426 


1776 


A 


3431 


1662 


369 


AIWWLSWLQHDLLPTPTQVAIDFTASNGDPR 

SSQSLHCLSPRQPNHYLQALRAVGGICQDYD/ 

SVGESGAGGNRQGGLAQRIPQLFLLPSDKRFP 

AFGFGARIPPNFEVG*MRGKEGDGGRVSQAE 

JCAGPHCSRLALTGXSHDFAINFDPENPECEGK 

RGDFHLPRLPADTLHTGAQTPLPRAQLPVPST 

HPRPVFI\EISGVIASYRRCLPQIQLYGPTNVAP 

IINRVAEPAQREQSTGQATKYSVLLVLTDGV 

VSDMAETRTATVRASRLPMSHIVGVGNADFS 

DMRLLDGDDGPLRCPRGVPAARDIVQFVPFR 

DFKD VSPPGPFRLKDS S ASHPPKSDLRLPPFD 

VUJITREPSWPP*SPTSPSDDPASPTLPLTPNHI 

TVPTL\AAPSAL AKC VLAE VPRQ VVE Y Y AS Q 

GISPGAPRPCTLATTPSPSP 


427 


1777 


A 


3446


79 


9748 


GCQSCWPAWPRLRRRGPASAGARLGRKAPW 

GLPGRVQDGRPLRFCFYLRPRAPFIAPVLSGA 

ASRPEASGDCRAGRETAMATLEKLMKAFESL 

KSFQQQQQQQQQQQQQQQQQQQQQQQPPPP 

PPPPFPPQLPQPPFQAQPLLPQPQPPFPPPPPPP 

GPAVAEEPLHRPKKELSATKKDRVNHCLTIC 

EN1VAQSVRNSPEFQKLLGIAMELFLLCSDDA 

ESDVRMVADECLNKVDCALMDSNLPRLQLEL 

YKEIKKNGAPRSLRAALWRFAELAHLVRPQK 

CRPYLVNLLPCLTRTSKRPEESVQETLAAAVP 

KIMASFGNFANDNEIKVLLKAFIANLKSSSPH 

RRTAAGS AVSICQHSRRTQYFYS WLLN VLLG 

LLVPVEDEHSTLLDLGVLLTLRYLVPLLQQQV 

KDTSLKGSFGVTRKEMEVSPSAEQLVQVYEL 

TLHHTQHQDHNWTGALELLQQLFRTPPPEL 

LQTLTAVGGIGQLTAAKEESGGRSRSGSIVELI 

AAj v» LJ o oLOr V 1jo1\JvI^I\AJ1V Y l^LAJ&EJlJ\Xjn,UUii 

ESRSDVSSSALTASVKDEISGBLAASSGVSTPG 

SAGHDIITEQPRSQHTLQADSVDLASCDLTSS 

ATDGDEEDBLSHSSSQVSAVPSDPAMDLNDG 

TQASSPISDSSQTTTEGPDSAVTPSDSSEIVLD 

GTDNQYLGLQIGQPQDEDEEATGILPDEASEA 

FRNSSMALQQAHLLKNMSHCRQPSDSSVDKF 

VLRDEATEPGDQENKPCRIKGDIGQSTDDDS 

APLraCVRLLSASFLLTGGKNVLVPDRDVRV 

SVKALALSCVGAAVALHPESFFSKLYKVPLD 

'! \iiyj>eeqyvsdii^idhgdpqVrgaYaJLC 

gtlicsilsrsrfhvgdwmgtotltgf^ 

adcipllrktlkdessvtcklactavrncvm 

slcsssyselglqlhdvltlrnssywlvrtel 

letlaerofrlvsfleakaenlhrgahhytgl 

lklqervujnvvihllgdedprvrhvaaasl 

irlvpklfykcdqgqadpwavardqssvyl 

kllmhetqpps1 ifs vstitriyrg ynllpsitd 

vtmennlsrviaayshelrrsttraltfgcce 

alcllstafpvciwslgwhcgvpplsasdesr 

ksctvgmatmiltllssawfpldlsahqdal 

ilagnllaas apkslrs s waseeeanpaatk 

qeevwpalgdralvpmveqlfshllkvinic 

ahvlddvapgpaikaalpsltnppslspirrk 

gkekepgeqasvplspkkgseasaasrqsdts 

gpvttsks sslgsfyhlpsylklhdvlkatha 

nykvtldlqnstekfggflrsaldvlsqilel 

atlqdigkcveeilgylkscfsrepmmatvc 

vqqllktlfgtnlasqfdglssnpsksqgra 

qrlgsssvrpglyhycfmapythftqalada 

slrnmvqaeqendtsgwfdvlqkvstqlkt 

nltsvtknradknaihnhirlfeplvikalkq 

yttttcv qlqkq vldllaql vqlrvnycll 

DSDQVFIGFVLKQFEYIEVGQFRESEAIIPNIFF 

FLVLLSYERYHSKQDGIPKIIQLCDGIMASGR 

BCAVTHAIPALQPIVHDLFVLRGTNKADAGKE 

LETQKEVVVSMLLRLIQYHQVLEMFILVLQQ 

a-IKENEDKWKRLSRQIADIILPMLAKQQMHI 

DSHEALGVLNTLFEILAPSSLRPVDMLLRSMF 

VTPNTMASVSTVQLWISGILAILRVLISQSTED 

IVLSRIQELSFSPYLISCTV1NRLRDGDSTSTLE 

EHSEGKQIKNLPEETFSRFLLQLVGILLEDIVT 

KQLKVEMSEQQHTFYCQELGTIXMCLIHIFKS 

GMFRRTTAAATRLFRSDGCGGSFYTLDSLNLR 

ARSMITTHPALVLLWCQIIXLVNHTDYRWW 

AEVQQTPKRHSLSSTKXLSPQMSGEEEDSDLA 

AKLGMCNREIVRRGAL1LFCDYVCQNLHDSE 

HLTWLIVNfflQDLISLSHEPPVQDFISAVHRNS 

AASGLFIQAIQSRCENLSTPIMLKKTLQCLEGI 

HLSQSGAVLTLYVDRLLCTPFRVLARMVDIL 

ACRRVEMLL AANLQ S SMAQLPMEELNRIQEY 

LQSSGLAQRHQRLYSLLDRFRLSTMQDSLSPS 

PPVSSHPLDGDGHVSLETVSPDKDWYVHLVK 

SQCWTRSDSALLEGAELVNRIPAEDMNAFM 

MNSEFNLSLLAPCLSLGMSEISGGQKSALFEA 

AREVTLARVSGTVQQLPAVHHVFQPELPAEP 

AAYWSKLNDLFGDAALYQSLPTLARALAQY 

LVWSKLPSHLHLPPEKEKDIVKFWATLEAL 

SWHLIHEQIPLSLDLQAGLDCCCLALQLPGL 

WSWSSTEFVTHACS1 lYCVUFTT PAVAVOPri 

EQLLSPERRTNTPKAISEEEEEVDPNTQNPKYI 

TAACEMVAEMVESLQSVLALGHKRNSGVPA 

FLTPLLRNniSLARLPLYNSYTRVPPLVWKLG 

WSPKPGGDFGTAFPEIPVEFLQEKEVFKEFIYR 

INTLGWTSRTQFEETWATLLGVLVTQPLVME 

QEESPPEEDTERTQINVLAVQAITSLVLSAMT 

WVAGNPAVSCLEQQPRNKPLKALDTRFGRK 

LSIIRGIVEQEIQAMVSKRENIATHHLYQAWD 

PVPSLSPATTGALISHEKLLLQINPERELGSMS 

YKLGQVSIHSVWLGNSITPLREEEWDEEEEEE 

ADAPAPSSPPTSPVNSRKHRAGVDIHSCSQFL 

LELYSRWILPSSSARRTPAILISEVVRSLLVVS 

DLFTERNQFELMYVTLTELRRVHPSEDEILAQ 

YLVPATCKAAAVLGMDKAVAEPVSRLLESTL 

RSSHLPSRVGALHGILYVLECDLLDDTAKQLI 

P VI SD YLLSNLKGI AHCVNIH SQQHVLVMCAT 

AirVT TCKIVDI n\7rJPPPQAdTTnN4PfTVX>TT Qncjp 
f\r i LfUCfiN x JrJLly v vjrrtT ofVollv^mv^vj v iylLljvjoci 

ESTPSIIYHCALRGLERLLLSEQLSRLDAESLV 

KLSVDRVNVHSPHRMIAALGLMLTCMYTG 

KEKVSPGRTSDPNPAAPDSESVIVAMERVSVL 

FDRIRKGFPCEARWARILPQFLDDFFPPQDIM 

INNVl^J&rl-iOlNVfx-r * r\£rwu\i v v uvvi* v^iLno 

TGQSSMVPO)WVMLSLS^JFTQRAPVAMATWS 

LSCFFVSASTSPWVAAILPHVISRMGKLEQVD 

VNLFCLVATDFYRHQIEEELDRRAFQSVLEV 

VAAPGSPYHRLLTCLRNVHKVTTC 


428 


1778 


A 


3449 


3 


430 


NSRPSPSAALVEVLLRSGSTFPHTVSGGWAA 
WGPWSSCSRDCELGFRVRKRTCTOPEPRNGG 

WSPCSASCGGGHYQRTRSCTSPAPSPGEDICL 
GLHTEEALCATQACPEGWS 


429 


1779 


A 


3464 


583 


3 


DALDRRYLEkCHPAAGGWVGEGE*ALCQKTT 

RFSGVLEPP1PSLKDGGRFPAWT*RSCSKSLR 

AAFTSQFFPSRRSRASPGSAPNGNGQNLTEQHP 

CPGSCDPQVLSASWM*VEHRSKFRPPP*NSTI 

PPES/RS* QGGTVQTGQHSSGREAGSWRARGR 

NAGRR+KGGGKIGTKQGAVRARKECRGEMA 

SGETDSE 


430 


1780 


A 


3473 


2802 


270 


FRMRIFLHCPWNQQMWKIWNLLETSLESCKA 

HLSIQKLLKER\Q\QLPVFKHRDSIYETLKRHR 

WWAGETNGSGKSTQVPHFLLEDLLLNEWE 

ASKCMVCTQPRRISAVSLANRVCDELGCENG 

PGGRNSLCGYQBRMESRACESTR3XYCTTGV 

LLRKLQEDGLLSNVS/HMFIVDEVVHERVSVQS 

DFLLIILKEILQKRSDLHLILMSATVDSEKFST 

YFTHCPILRISGRSYPVEVFHLEDIIEETGFVLE 

KDSEYCQKJLEEEEEVTINVTSKAGGIKKYQE 

Y IP VQTGAHADLNPF YQKYSSRTQHAIL YMN 

PHKJnUDLILELLAYLDKJSPQFRNIEGAVLIFL 

PGLAfflQQLYDLLSNDRRFYSERYKVTALHSI 

LSTQDQAAAFTLPPPGVRFOVLATNIAETGlTr 

PDWFVIDTGRTKENKYHESSQMSSLVETFVS 

KASALQRQGRAGRVPJ)GFCFRMYTRERFEG 

FMDYSVPEILRVPLEELCLHIMKCNLGSPEDF 

LSKALDPPQLQVISNAMNLLRKIGACELNEPK 

VATLAAVMTEKSPFTTPIGRKDEADLAKSAL 

AMADSDHLTIYNAYLGWKKARQEGGYRSEI 

TYC1WJFLNRTSLLTLEDVKQELIKLVKAAGF 

SSSTTSTSWEGNRASQTLSFQEIALLKAVLVA 

GLYDNVGKnYTKSVDVTEKLACIVETAOGK 

AQVHPSSVNRDLQTHGWLLYQEKIRYARVY 

1JRETTL1TPFPVLLFGGDIEVQHRERLLSIDGW 

IYFQAPVKIAViraQLRVLIDSVLRKKLENPK 

MSLENDKILQIITELIKTENN 


431 


1781 


A 


3474 


1 


441 


FRP APGH VQP * GGSS AAAGGGLLSHPRPCQQ 

PCPPAPAPSRPRSLGSLGQRVPAALATAAQEL 

PATLGGDGGKPALTAGEAALPGLHRSGVPAA 

AARC*PCT/SRPT*STLSPTQAAWWCRPSRRQ 

QRGEASTGGASGRRCGSCFQV 


432 


1782 


A 


3478 


41* 


23 


OLmTU>Nl^TV r /YS$*JMAWH**iaWQm 

QWRRESPEIDL<XYS*LSFDK[EAKAIK/WKE 

CSLFr^W(mNWM/UWQKK]U*VQTLHPS 

QKLKVSKWIKDLNVECRITKLLDQEYPGDLGY 

SRALNSGSR 


433 


1783 


A 


3504 


1876 


552 


clapcspqpekngmqplllllppLLyqOllhS 
slgapgestllvrtskllvglolqllvwlll 
qtrsllalqlhltssapllaaptavcscsrcs 

APRSRCVARPAARTGLPTPAPASSPAPAASPA 

PAASPAPAESTANPQPLILLPKP/PPAPGAPPPRP 

GAPPPRPAASPSPAASPAPPAASPVLTASPPLP 

AASPSPAASPAPPAASPVLTASPPLPAASPSPA 

ASPAPPAASPVLTASPPLPAASPALAASPVHT 

ASPPVHVASPPVHTASPPVHVASPPVHTASPP 

VHVASPPVHTASPHVHVASPPVHVASPPVHV 

ASPPVHTASPPVHVASPPVHTASPHVHVASPP 

VHTASPPVHVASPPVHVASPPVHVAYPPVHV 

ASPPVHVASPPVHVASPPVSCSGDSTSDCFPP 

QPG AVFPHSLAPSLGG WSHLVAALP 


434 


1784 


A 


3516 


142 


590 


GGVNRPRSETEQVKTPVLISSWDYRHPPPRPA 
SFFmv*TGP\TAlJUlMVLISWPCDLPTSASQ 
SAGITGVRHHA\RLLYFEQESHSVTQAGW\VQ 
WHNLGSLQPLSLEDRLSPGVLGCSALCRSGV 
RTKFGINMVTSRERGTTRLPKEG I 


435 


1785 


A 


3529 


1 


3161 


MSLVRAALEALDELDLFGVKGGPQSVIHVLA 

DEVQHCQSILNSLLPRASTSKEVDASLLSWS 

FPAFAVEDSQLVELTKQEIITKjLQGRYGCCRF 

LRDGYKTPKEDPNRLYY/ENPAELKLFENIEC 

EWPLFWTYFILDGVFSGNAEQVQEYKEALEA 

VLDCGKNGVPLLPELYSVPPDRVDEEYQNPHT 

VDRVPMGKLPHMWGQSLYILGSLMAEGFLA 

PGEIDPLNRRFSTVPKPDVYVQVYPSLPHGCS 

SKSPSHQCTIISIRTTRJCITAPVSILAETEEIKTIL 

KDKGIYVETIAE VYPIRVQPARJDLSHI YS SLEIF 

LPFLNSVSGCNNRMKLSGRPYRHMGVLGTSK 

LYDIRKTIFTFTPQFIDQQQFYLALDNKMIVE 

MLRTDLSYLCSRWRMTGQPTITFPISHSMLDE 

DGTSLNSSILAALRKMQDGYFGGARVQTGKL 

SEFLTTSCCTHLSFMDPGPEGKLYSEDYDDN 

YDYLESGNWMNDYDSTSHARCGDEVARYL 

DHLLAHTAPHPKLAPTSQKGGLDRFQAAVQT 

TCDLMSLVTKAKELHVQNVHMYLPTKLFQA 

SRPSFNLLDSPHPRQENQVPSVRVEIHLPRDQ 

SGEVDFKALVLQLKETSSLQEQAD&YMLYT 

MKGPDWNTELYNERSATVRELLTELYGKVG 

EIRHWGLIRYISGILRKKVEAXDEACTDLLSH 

QKHLTVGLPPEPREKTISAPLPYEALTQLIDEA 

obtjDMblMLl QE1MVYLAMYMRTQPGLFAE 

MFRLRIGLIIQVMATELAHSLRCSAEEATEGL 

MNLSPSAMKNLLHHILSGKEFGVERSVRPTD 

SNVSPAISIHEIGAVGATKTERTGIMQLKSEIK 

QSPGTSMTPSSGSFPSAYDQQSSKDSRQGQW 

QRRRRLDGALNRVPVGFYQKVWKVLQKCH 

GI^VEGFVLPSSTTREMTPGEIKFSVHVESVL 

NRWQPEYRQLLVEAIL\VLTMLADIEI\HSIGS 

I1AVEK1VHIANDLFLQEQKTLGADDTMLAKD 

PASGICTLLYDSAPSGRFGTMTYLSKAAATY 

VQEFLPHSICAMQ 


436 


1786 


A 


3546 


73 


393 


CP*LTWELLEVKKAEVLQDSLDGRYSTPSSCL 
EQPDSCRPYGRSFYALEEKHVIFSLDVGETDN 
KGKGKTIRGI*TFKGRKGGTY QREHDANPLA 
PXSARSCWMRKG 


437 


1787 


A 


3554 


5157 


2939 


AVRAEPGLEELSSGLRAHSPSATTVCEPEAQG 

SASGCRYAAHPHWGLGGAAAAGGSWEPQPP 

RPVCEPAGRGKPHPPAAPRSPLLPGSRRRPHA 

AQPGARARTSPPPASARNMAARPAATLAWSL 

LLLSSALLREGCRARFVAERDSEDDGEEPWF 

PESPLQSPTVLVAVLARNAAHTLPHFLGCLER 

LDYPKSRMAIWAATDHNVDNTTEIFREWLK 

NVQRLYHYVEWRPMDEPESYPDEIGPKHWP 

TSRPAHVMKLRQAALRTAREKWSDYILFIDV 

DNFLTNPQTLNLLIAENKTIVAPMLESRGLYS 

NFWCGITPKGFYKRTPDYWQIREWKRTGCFP 

VPMVHSTFLIDLRKEASDKLTFYPPHQDYTW 

TFDDIIVF AFS SRQAGIQMYLCNREHYG YLPIP 

LKPHQTLQEDIENLIHVQIEAMIDRPPMEPSQ 

YVSVWKYPDKMGFDEIFMINLKKRKGQGGD 

RWLRTLYEQEIEVBOVEAVDGKALNTSQLKA 

LNIENILPGYRDPYSSRPLTRGEIGCFLSHYSV 

WK3EVIDRELEKTLVIEDDVRFEHQFKICKLMK 

LMDNIDQAQLDWELIYIGRKRMQVKEPEKA 

VFNVANLVEADYSYWTLGYVISLJi^ 

GANPFGKMLPVDEFLPVMYNKHPVAEYKEY 

YESRDLKAFSAEPLLIYPTHYTGQPGYLSDTE 

TSTIWDNETVATDWDRT11AW1CSRKQSRIYSN 

AKNTEALPPPTSLDTVPSRDEL 


438 


1788 


A 


3563 


130 


527 


IFFNSSSLFCRVFCLFLRWSFTLVAQARVQ*C 
NLSSLQPLPPGFK*FSCLSPPRS*DYRRPPPRPA 
NFLYF* *RQGFTVLGQAGLELLT/S/GDPPTSA 
SQSAGITGVSHRAWPVHA1STHISLVKTRPSLT 
TLG 


439 


1789 


A 


3565 


446 


1834 


LLQPAMRKSPGLSDCLWAWILLLSTLTGRSY 

GQPSLQDELKDNTTVFTRILDRLLDGYDNRL 

RPGLGERVTEVKTDIFVTSFGPVSDHDMEYTI 

DVFFRQSWKDERLKFKGPMTVLRLNNLMAS 

KIWTPDTFFHNGKKSVAHNMTMPNKLLRITE 

UU 1 LJLY 1 MKL i VKVAJiCrMAr uKJL>rrM\U\AJnl 

ACPLKFGSYAYTRAEVVYEWTREPARSVVV 

AEDGSRLNQYDLLGQTVDSGIVQSSTGEYW 

MTTHFHLKRK1GYFVIQTYLPCIMTVILSQVSF 

WLrlNtxEiO Vr f\M\L VruV J- 1 YJullVll I L»oLor\tvlNoJL» 

PKVAYATAMDWFTAVCYAFVFSALIEFATVN 
YFTKRGYAWDGKSVVPEKPKKVKDPLIKKN 
NTYAPTATSYTPNLARGDPGLATIAKSATIEP 
KEVKPETKPPEPKKTFNSVSKIDRLSRIAFPLL 
FGIFNLVYWATYLNREPQLKAPTPHQ 


440 


1790 


A 


3568 


1 


350 


STSSCFPAAAAAIMREIVHLQAGQCGNQIGAK 
FWEVISDEHGIDPTGTYHGDSDLQLERINVYY 
NEATGE AP VPSFTALRGPRGPCLG* RPP VP AG 
frV YVP"R A VT VDMFPfrTMDW 


441 


1791 


A 


3569 


2 


1751 


FVAVAGAVSGEPLVHWCTQQLRKTFGLDVS 

EEOQYVLSIESAEEIREYVTDLLQGNEGKKGQ 

FIEEHTKWQKNDQELISDPLQQCFKKDE1LDG 

QKSGDHLKRGRKKGRNRQEVPAFTEPDTTAE 

VKTPFDLAKAQENSNSVKKKTKFVNLYTREG 

QDRLAVLLPGRHPCDCLGQKHKLINNCLICG 

RIVCEQEGSGPCLFCGTLVCTHEEQDILRGDS 

N\KSQKLLKKLMSGVENSGKVDISTKDLLPH 

QELRIKSGLEKAIKHKDKLLEFDRTSIRRTQYI 
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ELRHASRLSKKVTIDFAGRKILEEENSLAEYH 

SRLDET1QAIANGTLNQPLTKLDRSSEEPLOVL 

VNPNMYQSPPQWVDHTQAASQKKAFRSSGF 

GLEFNSFQHQLRIQDQEFQEGFDGGWCLSVH 

QPWASLLVRGKRVEGRSWYTPHRGRLWIAA 

TAKKPSPQEVSELQATYRLLRGKDVEFPNDY 

PSGCLLGCVDLIDCLSQKQFKEQFPDISQESDS 

PFVFICKNPQEMVVKFPIKGNPKIWKLDSKIH 

QGAKKGLMKQNKAV 


442 


1792 


A 


3576 


1 


2019 


MPRSHTGERLCEGKEGSQCAENFSPNLSVTK 

ktagvkpyecticgkafmrlssltrhmrsht 

airai\ekpykckec\grafslsqilsic\nersh 

tgekpykckqcgktfiyhqpfqrhertmgek 

pyeckqcgkalsc s s slrvherihtgekp yec 

kqcgkafscsssirvherihtgebcpyack:\ec 

gkafis\ttsvlthmithngdrpykckecgka 

fifpsflrvherihtgekpykckqcgkafrws 

tsiqiherihtgekpykckecgksfsarpafrv 

hvrvhtgekpykckecgkafsrisyfrihert 

n 1 OxlKx i cLrUs.LOK 1 r N i rLDLKIHKKNHTG 
EKPYECKECAKTFISLENFRRHMITHTGDGPY 

CGKAFSCSSYIRIHKRTHTGEK\PYECKECGK 

AFIYPTSFQGHMRMHTGEKPYKCKECGKAFS 

LHSSFR\RHTRIHNYEKPLEC* Q\CGKAFS VSTS 

LKKPMRNAQSDRKLY/KCEK*EKVFNSNRCF 

QSCENSH*REKSCQCK* YRKRDTR* FMYSQ V 

PHNHVSVSNGPYR/CGSPIRLYNT*NIS[NRNL 

VAVVTP*CSTLFKCLWCWCKRAALSW*/IVQ 

DSGRGRWLTPVIPALWEAKAGGSRGQEIKTIL 

ANTVKPHLY 


443 


1793 


A 


3578 


287 


114 


DFYERKFEQFDEGHKQIVNKWRDLLCSWKRK 
LSIIKKSVLQNNL*FSAASMRFQKVFF 


444 


1794 


A 


3582 


3335 


1909 


HLFFSLFLAAMAMTGSTPCSSMSNHTKERVT 
MTKVTLBNFYSNLIAQHEEREMRQKKLEKV 
MEEEGLKDEEKRLRRSAHARKETEFLRLKRT 
RLGLEDFESLKVIGRGAFGEVRLVQKKDTGH 
VYAMKILRKADMT,FKFOVnHTR AFRDTI VP A 

DSLWVVKMFYSFQDKLNLYLIMEFLPGGDM 
MTLLMKKOTLTEEETQFYIAETVLAIDSIHQL 
GFIHRDIKPDNLLLDSKGHVKLSDFGLCTGLK 
KAHRTEF YRNLNHSLP SDFTTQNMNSKRKAE 
TWKRNRRQLAFSTVGTPDYIAPEVFMQTGYN 
KLCDWWSLGVIMYEMLIGYPPFCSETPQETY 
KKVMNWKETLTFPPEVPISEKAKDLILRFCCE 
WEHRIGAPGVEEIKSNSFFEGVDWEHIRERPA 
AISIEIKSIDDTSNFDEFPESDILKPTVATSNHPE 
TDYKNKDWVTINYTYKRFEGLTARGAIPSYM 
KAAK 


445 


1795 


A 


3584 


1 


6169 


RTRGIEKRFAYSFLQQLIRYVDEAHQYILEFD 

GGSRGKGEHFPYEQEKFFAKVVLPLIDQYFK 

NHRLYFLSAASRPLCSGGHASNKEKEMVTSL 

FCKLGVLVRHRISLFGNDATSIVNCLHILGQT 

LDARTVMKTGLBSVKSALRAFLDNAAEDLE 

KTN1ENLKQGQFTHTRNQPKGVTQ1INYTTVA 

LLPMLSSLFEHIGQHQFGEDLILEDVQVSCYRI 

LTSLYALGTSKSIYVERQRSALGECLAAFAGA 

FPVAFLETHLDKHNIYSIYNTKSSRERAALSLP 

TNVEDVCPNIPSLEKLMEEIVELAESGIRYTQ

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MPHVMEVILPMLCS YM SRW WEHGPENNPER 

AEMCCTAU^SEHMNTLLGNILKIIYNNLGIDE 

GAWMKRLAVFSQPIINKVKJ'QLLJCIHFLPLM 

EKLKKKAATVVSEEDHLKAEARGDMSEAEL 

ULDEFTTLARDLYAPYPLLIRFGDYNRAKWL 

KEPWEAEELFRMVAEVF1YWSKSHNFKREE 

QNFWQNEINNMSPLITDTKSKMSKAAVSDQ 

ERKJCMKRKGDRYSMQTSLIVAAUCRl^ 

NICAPGDQEUALAKNRFSLKDTEDEVRDI1RS 

NIHLQGKLEDPA1RWQMALYKDLPNRTDDTS 

DPEKTVERVLDIANVLFHLEQKSKRVGRRHY 

CLVEHPQRSKKAVWHKLLSKQRKRAVVACF 

RMAPLYNLPRHRAVNLFLQGYEKSWIETEEH 

YFEDKLIEDLAKPGAEPPEEDEGTKRVDPLHQ 

LILLFSRTALTEKCKLEEDFLYMAYADIMAKS 

CHDEEDDDGEEEVKSFEEKEMEKQKLLYQQ 

ARLHDRGAAEMVLQTISASKGETGPMVAAT 

LKLGIAILNGGNSTV QQKMLD YLKEKKDVGF 

FQSLAGLMQSCSVLDLNAFERQNKAEGLGM 

VTEEGSGEKVLQDEHEFTCDLFRFLQLLCEGH 

NSDFQNYLRTQTGNNTTVN1IISTVDYLLRVQ 

ESISDFYWYYSGKDVIDEQGQRNFSKA1QVA 

K.QVFNTLTEYIQGPCTGNQQSLAHSRLWDAV 

VGFLHVFAHMQMKLSQDSSQIELLKELMDLQ 

KDMWMLL SMLEGNV VNGTIGKQMVDML V 

ESSNNVEN1ILKFFDMFLKLKDLTSSDTFKEYD 

PDGKGVIFKRDFHKAMESHKHYTQSETEFLL 

SCAETDENETLDYEEFVKRFHpPAKDIGFNVA 

VLLTNLSEHMPNDTRLQTFLELAESVLNYFQP 

FLGRIEIMG S AKRIERV YFEI SES SRTQ WEKPQ 

VK£SKRQFIFDVVNEGGEKEICMELFVNFCED 

TIFEMQLAAQISESDLNERSANKEESEKERPEE 

QGPRMAFFSILTVRSALFALRYNILTLMRMLS 

LKSLKKQMKKVKKMTVKDMVTAFFSSYWSI 

FMTLLHFVASVFRGFFRUCSLLLGGSLVEGA 

KKIKVAELLANMPDPTQDEVRGDGEEGERKP 

LEAALPSEDLTDLKELTEESDLLSDIFGLDLKR 

EGGQYKLIPHNPNAGLSDLMSNPVPMPEVQE 

KFQEQKAKEEEKEEKEETKSEPEKAEGEDGE 

KEEKAKEDKGKQKLRQLHTHRYGEPEVPESA 

FWKKIIAYQQKLLKyFAJRNFYNMRMLALFV 

AF AINFUJLF YKVSTS S V VEGKELPTRS S SENA 

KVTSLDSSSHRIIAVHYVLEESSGYMEPTVRIL 

PILHTVISFFCIIGYYCLKVPLVIFKREKEVARK 

LEFUGLYITEQPSEDDlKGQWDRLVlNrQSFP 

NNYWDKFVKRKVMDKYGEFYGRDRISELLG 

MDKAALDFSDAREKKKPKKDSSLSAVLNSID 

VKYQMWKLGWFTDNSFLYLAWYMTMSVL 

GHYWJFFFAAHLLDIAMGFKTLRTILSSVTH 

NGKQLVLTVGLLAWVYLYTWAFNFFRKF 

YNK.SEDGDrPDh4KCDDMLTCYMFHMYVGV 

RAGGGIGDEIEDPAGDEYEIYRIIFDITFFFFVI 

V1LLAIIQGLIIDAFGELRDQQEQVKEDMETKC 

FICGIG>uOYFDTVPHGFETrra>QEHNLANYLF 

FLMYLINKDETEHTGQESYVWKMYQERCWE 

FFPAGDCFRKQYEDQLN 


446 


1796 


A 


3592 


1 


355 


AGLELLNSDDPPALASQSAGITGVTRTPSLFF* 
DTVLLCCSGWSAVAPSRLTAALFS*AQAVCL 
SLPRSWDYRRW/PPHPANFCIFCRDE/SLA/ML 
PRLVSNSWTQA1LLPRPPKMLGLQV 


447 


1797 


A 


3598 


1202 


1070 


LFVGGGP1CPEGASGFAPGPAPAPRVGVDAEV 

GR*V*GAAASQGA/GSLRPRPTGPGHPGAWL 

QVWGAAAVCAGPAM*/AVRAKRGPRAG*EP 

NSPWRSGVLAA\RAVGAGPWP*P*PGCS*ARG 

PSSRSAPGLASGPAAPLLQGVHSSAGPLLCYI 

NGTLALGLKP**AWGWGEWRPKG 


448 


1798 


A 


3604 


3115 


557 


FRRKGOGGPKDFGAGLKYNSRHEKVNGLEE 

GVEFLPVNNVXKVEKHGPGRWVVLAAVLIG 

LLLVLLGIGFLVWHLQYRDVRVQKVFNGYM 

RITNENFVDAYENSNSTEFVSLASKVKDALKL 

LYSGVPFIGPYHKESAVTAFSEGSVIAYYWSE 

FSIPQHLVEEAERVMAEERWMLPPRARSLKS 

FWTSWAFPTDSKTVQRTQDNSCSFGLHAR 

GVELMRFTTPGFPDSPYPAHARCQWALRGD 

ADSVLSLTFRSFDLASCDERGRHLV\TVYNT\L 

SPMEPHA\LVQLCGTYPPSYNLTFHS\S\QNVL 

LITLITNTERRHPG\FEATFFQLPRMSSCGGRL 

RKAQGTFNSPYYPGI-IYPPNIDCTWNIEVPNN 

QHVKVRFKFFYLLEPGVPAGTCPKDYVEING 

EKYCGERSQFWTSNSNKITVRFHSDQSYTDT 

GFLAEYLSYDSSDPCPGQFTCRTGRCIRKELR 

CDGWADCTDHSDELNCSCDAGHQFTCKNKF 

CKPLFWVCDSLNDCGDNSDEQGCSCP\AQTF 

RCSNGKCLSKSQQCNGKDDCGDGSDEASCP 

KVNWTCTKHTYRCLNGLCLSKGNPECDGK 

EDCSDGSDEKDCDCGLRSFTRQARWGGTD 

ADEGEWPWQVSLHALGQGHICGASLISPNWL 

VSAAHGYTDDRGFRYSDPTQWTAFLGLHDQS 

QRSAPGVQERRLKRIISHPFFNDFTFDYDIALL 

ELEKPAEYSSMVRPICLPDASHVFPAGKAIWV 

TGWGHTQYGGTGALILQKGEIRVINQTTCEN 

LLPQQITPRMMCVGFLSGGVDSCQGDSGGPL 

SSVEADGRIFQAGWSWGDGCAQRNKPGVY 

TRLPLFRDWIKENTGV 


449 


1799 


A 


3618 


2 


613 


FVSGSPWRMDGSTERLEARRPAGRLPWSSRQ 
EMTRRPSLMAGRQHGWSAQQSATVANPVPG 
ANPDLLPHFLGEPEDVYIVKNKPVLLVCKAV 
PATQIFFKCNGEWVRQVDHVEERSTDGSSGLP 
TMEVRINVSRQQVEKVFGLEEYWCQCVAWS 
SSGTTKSQKAYIRIAYLRKNFEQEPLAKEVSL 
EQGIVLPCRPPEGIPPAE 


450 


1800 


A 


3620 


1 


2676 


MEPSLGQGMDLTCPFGVSPACGAQASWSIFG 

ADAAEVPGTRGHSQQEAAMPHIPEDEEPPGE 

PQA AQSPAGQQGPPTAGVSCSPTPTIVLTGDA 

TSPEGETOKNLANRVHSPHKRLSHRHLKVST 

ASLTSVDPAGH1IDLVNDQLPDISISEEDKKKN 

LALLEEAKLVSERFLTRRGRKSRSSPGDSPSA 

VSPNLSPSASPTSSRSNSLTVPTPPEGDEADVS 

SPHPGEPNVPKGLADRKQNDQRKVSQGRLAP 

KPPPVEKSKElAIEQKENFDPl.,QYPETTPKGLA 

PVTNSSGKMALNSPQPGPVESELGKQLLKTG 

WEGSPLPRSPTQDAAGVGPPASQGRGPAGEP 

MGPEAGSKAELPPTVSRPPLLRGLSWDSGPEE 

PGPRLQKYLAKLPLAEEEKRFAGKAGGKLAK 

APGLKDFQrQVQPVRMQKLTKLREEHILMRN 

QNLVGLKLPDLSEAAEQEJCGLPSELSPAIEEE 

ESKSGLDVMPNISDVLLRKLRVHRSLPGSAPP 

LTEKEVENVFVQLSSAFRKDSYTLESRINQAE 

RERNLTEENTEKELENFKASITSSASLWHHCE 

HRETYQKLLEDIAVLHRLAARLSSRAEWGA 
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VRQEKRM SKATE VMMQY VENLKKT Y JtiKDH 

AELMEFKXLANQNSSRSCGPSEDGVLRTARS 

MSLTLGKNMPRKRVSVAVVPKFNALNLPGQ 

TPSSSSIPSLPALSESPNGKGSLPVTSALPALLE 

NGKTNGDPDCEASAPALTLSCLEELSQETKA 

RMEEEAYSKGFQEGLKKTKELQDLKEEEEEQ 

KSESPEEPEE VEETEEEEKDPRS SKJLEEL VHFL 

QVMYPKLCQHWQYIWMMAAVMLVLTVVL 

GLYNSYNSCAEQADGPLGRSTCSAAQKDSW 

WSSGLQHEQPTEQ 


451 


1801 


A 


3623 


504 


198 


QLIQHQfTVHTGRKLYECKECGKAFNQGSTLI 
RHQRIHTGEKPYECKVCGKAFRVSSQLKQHQ 
RIHTGERPYQCKELKGRGAEMLAVLAVKEQ 
NRTPVNYGK 


452 


1802 


A 


3628 


2 


195 


MTCLHS AKAFHY* SSCSFSCEEGFALIGPE VV 

QCTALGVWTAPAPVCIAVQCQHLEALNEGT 

MG*DYPFTAFAYGSSCKYECHTVYRVRGLD 

MLHSRGCYLWNGHFTT*EAISCEPLERPCH*S 

V*CSFSCEEGFALIGPEWQCTALGVWTAPAP 

VCIAVQCQHLEALNEGTMG 


453 


1803 


A 


3637 


662 


142 


IQAKGLGIWHVPNKSPMQHWRMCGSLLRYRT 

DTGFLQTLGI-INLLGIYQKYPVKYGEGKCWT 

DNGPVIPWYDFGDAQKTASYYSPYGQREFT 

AGFVQFRVFNNERAANALCAGMRVTGCNTE 

HHCIGGGGYFPEASPQQCGDFSGFDWSGYGT 

\HVGYSSSREITE\AAVLLFYR 


454 


1804 


A 


3641 


1 


362 


TQVHPAMLGLDELGRSGCGHCTQADLRFGD 
AAGRDPGQDNDRNTAEPAFPPPPRVMAAAA 
ALRAPAQSSVTFEDVAVNFSLEEWSLLNEAQ 
GCLYHDVMLETLTLlSStGKVLILNCDLS 


455 


1805 


A 


3646 


2 


414 


AAAGRGASGALTGEGGGEQGRRVGLGSRAH 

SLLLGPTFNSCQVSSQPPRVAGLGLPLKHEPS 

RPQPPSPRGPRTVRAGVPGAHPQDTPCPEFVR 

PRKVPLVGEAPGLPPEERSRGWRRDTPGLQE 

SRVRAPSYDD1T 


456 


1806 


A 


3656 


396 


8 


QIVSFNSYLTLYTKNbn^KSMKDLNVNTEMIK 
LLELKNIHNLG*AOFLN*IQKALIKRKILIHW 
P/LIKIK/SFCSLSDTIKICMKRQTIVWEQTFira 
SVKELVSRIYEAFLQFNKTVNRPVFDIKKEQK 
F 


457 


1807 


A 


3660 


14 


1961 


SEAKLGGPTGMDLWQLLLTLALAGSSDAFSG 
SEATAAILSRAPWSLQSVNPGLKTNS SKEPKF 
TKCRSPERETFSCHWTDEVHHGTKNLGPIQLF 
YTRRNTQEWTQEWKECPDYVSAGENSCYFN 
SSFTSIWYCIKLTSNGGTVDEKCFSVDEIVQ 
PDPPIALN^TLLNVSLTGIHADIQVRWEAPRN 
ADIQKGWMVLEYELQYKEVNETKWKMMDP 
ILTTSVPVYSLKVDKEYEVRVRSKQRNSGNY 
GEFSEVLYVTLPQMSQFTCEEDFYFPWLLIIIF 

KGIDPDLLKEGKLEEVNTILAIHDSYKPEFHS 

DDSWVEFIELDIDEPDEKTEESDTDRLLSSDH 

EKLHINLGVKDGDSGRTSCCEPD1LETDFNAH 

DIHEGTSEVAQPQRLKGEADLLCLDQKNQNN 

SPYHDACPATQQPSVIQAEKNKPQPLPTEGAE 

STHQAAHIQLSNPSSLSNIPFYAQVSDITPAGS 

WLSPGQKNKAGMSQCDMHPEMVSLCQENF 

LMDNAYFCEADAKKCIPVAPHIKVESHIQPNS 

LNQEDIYITTESLT\TAAGSF\GTGEHVPGSEM 
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PVPnyT^TUTVOgpQr.T.n.xFATA iPLpDKHFL^ 
SCGYVSTDQLNKIMP 


458 


1808


A 


3663 


154 


462 


TRAPASGRSGAGLALSANAPDSGGHPGATEG 
PAG SLAHASGS ARGTWRVRGRGSHG WERTV 
GAGGCANPVPALHSCASAPRGTGRVSALGPK 
TGSSPLSSPKG 


459 


1809 


A 


3664 


902 


135 


LGKYNTSMALFDFVLHNSTGEIRYITEDDVIQ 

SQNALGKYNTSMALFESNSFEKTILESPYYVD 

LNQTLFVQVSLHTSDPNLWF1DTCRASPTSD 

FASPTYDLIKSGCSRDETCKWYPLFGHYGRF 

QFNAFKFLRSMSSVYLQCKVLICDSSDHQSRC 

\NQGCVSRSKRDISSYKWKTDSIIGPIRLKRDR 

SAVNONSGFQHETHAEETPNQPFNSVHLFSFM 

VLALNVVTVATITVRHFVNQRADYQ\YQKLQ 

NY 


460 


1810 


A 


3670 


850 


557 


LGILMSPQVEAGEI*ALLTPPPGCMQFSPLTL/P 

K*WVSPGLTP/PPPEVPSVFLVEPGLPHAGQA 

GLDLUTSGDPPASTSQSARTTDVSHRAQPLAI 

.S 


461 


1811 


A 


3671 


2472 


2099 


IGVLAFETGSCSVTRLYCIGIIMPHCSLDLAGSN 
TSAFRIAGTTSVHHHPQLTFFFFWIETGSHCV 
VQTGL*LLALSNPPALASQ1AGISGMSHRAWP 
GLVLYSLEFSLLCASQSUMLFTCYNE 


462 


1812 


A 


3672 


394 


110


VKPVNGESKRD+ GADTQTCEGEADEQLQ1\N 
CYYD/STKSFFYISCG+K\RKPTWAENRRLNA 
KMFGIPLHSNSDPWGYEEREVIGFHRSRVSRG 
HGS 


463 


1813 


A 


3673 


348 


1


QRNPFSAGHPQRPPTSGSQSELLAQPRLRPGR 
KSSFSRDQDVW* SQAVPKRQ* QRNPFSAGHP 
QRPPTSGSQSELLAQPRLRPGRKSSFSRDQDV 
WPGQKPRPSQQQHQMCASPTLGQRSPFALEP 
VPAYHGGRDPFASARPSPVGIPKPRAAPAGG 
GWRRIRPKSSTK 


464 


1814 


A 


3676 


2253 


320 


PVIQRCSQPYGFSLLISFFUCCVSETSQQPPSR 

KVFQLLPSFPTLTRSKSHESQLGNRIDDVSSM 

RFDLSHGSPQMVRRDIGLSVTHRFSTKSWLS 

QVCHVCQKSMIFGVKCXHCRIJKCHNKCTKE 

APACRISFLPLTRLRRTESVPSDINNPVDRAAE 

PHFGTLPKALTKKEHPPAMNHLDSSSNPSSTT 

FS1TSSPAPFPTSSNPSSATTPP\NPSP\GQR\DSR 

FNFPSCVAYFIHHR\Q\QFIFPDISAFAHAAPLPE 

AADGTRLDDQPKADVLEAHEAEAEEPEAGK 

SEAEDDEDEVDDLPSSRRPWRGPISRKASQTS 

VYLQEWDIPFEQVELGEP1GQGRWGRVHRGR 

WHGEVArRLI^MDGHNQDHLKLFKKEVMN 

YRQTRHENWLFMGACMNPPHLAIITSFCKG 

RTLHSFVRDPKTSU)INKTRQIAQEIIKGMGY 

LHAKGIVHKDLKSRNVFYDNG\KWITDFGLF 

\GISGWP\EGRRENQLKLSHDWLCYLAPEIVR 

EMTPGKDEDQLPFSKAADVYAFGTVWYELQ 

ARDWPLKNQAAEASIWQ1GSGEGMKRVLTS 

VSLGKEV8ENLSACWAFDLQERPS\FSLLMD 

MLEKLPKLNRRLSHPGHF*KSADINSSKWFR 

FERFGLGVLESSNPKM 


465 


1815 


A 


3679 




803 


IPSPAWWNSTWADTFSLLLALAVALYLGYY 

WACVLQTHRAFCASNTED1JETVVNHIKHRYP 

QAPLLAVGISFGGILVLNHLAQARQAAGLVA 

ALTLSACWDSFETTRSLETPLNSLLFNQPLTA 

GLCQLVERLSY/E*DLQARTIRQFDERYTSVA 
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FGYQDCVTYYKAASPRTKIDAIRIPVLYLSAA 
DDPFSTVCALPKQAAQHSPYVALLITARGGHI 
GFLEGLLP WQHW YMSKLLHQY AKA1FQDPE 
GLPDLRALLPSEDRNS 


466 


1816 


A 


3684 


3 


307 


SSQYTVQSKTKIFL*AAREKQ/RHTCRRFSIRLS 
AN1SSQTGEARGQWPSVFKVLKEKKLSTKKS 
FGQK*GR\RKTFPDKQK/LREFDTTRFnQEML 
TGVLQG 


467 


1817 


A 



3687 


2465 


837 


ELPTPL1AAHQLYNYVADHASSYHMKPLRMA 

RPGGPEHNEYALVSAWHSSGSYLDSEGLRHQ 

DDFDVSLLVCHCAAPFEBQGEAERHVLRLQF 

FVVLTSQRELFPRLTADMRKFRKPPRLPPEPE 

APGSSAGSPGEASGLILAPGPAPLFPPLAAEVG 

MARARLAQLVRLAGGHCRRDTLWKRLFLXE 

PPGPDRLRLGGRLALAELEELLEAVHAKSIGD 

IDPQLDCFLSMTVSWYQSLIKVLLSRFPQSCR 

HFQSPDLGTQYLVVLNQKFTDCFVLVFLDSH 

I.GKTSI.TWFREPFPVQPQDSESPPAQI ,VSTY 

HHLES VINT ACFTL WTRLL* GSGLDH* MSLFL 

ESWAYQ1ACQRQD*PALLGPRASQTLSDTKG 

FVTMS*GSAAPAWQQEPPSPNTHSH*PIQDSR 

ESGQPRGPLGPFWGTPFGPPGRVSGVHTGWQ 

TPPRAPLPESCPL\PLTTVSHLCPLSLRVFTSHL 

DITAGHSHRDDTWVPIPALPLKHLRPPSSPFA" 

LGPWVSHPLMRWVQKLSHLHSNPGTGFSMG 

GKQQRN 


468 


1818 


A 


3691 


960 


499 


QTCRKDKRAI YPHFQNE* MNEIKAI* SGTGG1 

QCFHSQNDSAFFFFLFLLETEFCSAA/TVQWH 

DFLSMQPPPPGFKQFTCLSLLSSWNYRR\PPPF 

PGNF\*FLVKTGFPHVGQTGFELLTSSDLAPLA 

SQNGGITGMSPCAWPFFFFFFFGLC 


469 


1819 


A 


3714 


4747 


495 


MAYSWQTDPNPNESHEKQYEHQEFLFVNQP 

HSSSQVSLGFDQ1VDEISGKIPHYESEEDENTFF 

VPTAPK WD STGHSLNE AHQISLNEFTSKSREL 

SWHQVSKAPAIGFSPSVLPKPQNTNKECSWG 

SPIGKHHGADDSRFSDLAPSFTSLDKINLEBCEL 

ENENHNYHIGFESSIPPTNSSFSSDFMPKEENK 

RSGHVNIVEPSL\fLLKGSLQPGMWESTWQK 

>^SIGCSIQLVEWQSSNTSLASF(m.VKKIR 

ERYHAADVNFNSGKI W STTTAFPYQLFSKTK 

FNIHIF1DNSTQPLHFMPCANYLVKDLIAEILH 

FCTNDQLLPKDHELSVWGSEEFLQNDHCLGS 

HKMFQKDKSVIQLHLQKSREAPGKLSRKHEE 

DHSQFYLNQLLEFMHIWKVSRQCLLTLIRKY 

DFHLKYLLKTQENVYNnEEVKKICSVLGCVE 

TKQITDAVNELSLILQRKGENFYQSSETSAKG 

LIEKVTTELSTSIYQLINVYCNSFYADFQPVNV 

PRCTSYLNPGLPSHLSFTVYAAHNIPETWVHR 

INFPLEIKSLPRESMLTVBXFGIACATNNANLL 

AWTCLPLFPKEKSILGSMLFSMTLQSEPPVEM 

rTPGVWDVSQPSPVTLQDDFPATGWEYMKPD 

SEENRSNLEEPLKECIKHIARLSQKQTPLLLSE 

EKKRYLWFYRFYCNNENCSLPLVLGSAPGW 

DERTVSEMHT1LRRWTFSQPLEALGLLTS SFP 

DQEIRKVAVQQLDNLLNDELLEYLPQLVQAV 

KFEWNLESPLVQIXLHRSLQSIQVAHRLYWL 

LKNAENEAYFKSWYQKLLAALQFCAGKALN 

DEFSKEQKLIKILGDIGERV^SASDHQRQEVL 

KKEIGRLEEFFQDVNTCHLPLNPALOKGIDH 

DACSYFTSNALPLKITFlNANLMGKNISnFKA 
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nucleotide insertion 














QDDIJJ^DMLVLQUQVMDNIWLQEQLDMQ 

MIIYRCLSTCB^QRLVQMV^ 

SGUGPLKENTKKWFSQHNHLKADYEKALR 

NFFYSCAGWCVVTFILGVCD1UJNDNIMLTKS 

GHMFHIDFGKFLGHAQTFGGIKRDRAPFIFTS 

EM\EYFITEGG\KNPQHFQDFV\ELCCRAYNIIR 

KHSQLLL\NLL\EMMLYAGULPELSGI\QDLKY 

VYNNLRPQDTDLEATSHFTKKIKESLECFPVK 

LNNLIHTLAQMSAISPAKSTSQTFPQESCLLST 

TRSIERATILGFSKKSSNLYLIQVTHSNNETSL 

TEKSFEQFSKLHSQLQKQFASLTLPEFPHWW 

HLPFTNSDHRRFRDLNHYMEQILNVSHEVTN 

SDCVLSFFLSEAGQQTYEESSPVYLGEKFPDK 

KPKVQLVISYEDVKLT1LVKHMKNIHLPDGSA 

PSAHVEFYLLPYPSEVRRRKTKSVPKCTDPTY 

NEIWYDEVTELQGHVLMLIVKSKTVFVGAI 

NIRLCSVPLDICEKWYPLGNSII*PLLLFSSFGM 

KSLEKDEFVGGMLLSNT1W 


470 


1820 


A 


3718 


430 


75 


SHGSISDLNLHQGCVFLPSLPAQGLRCYRCLA 
VLEGASCSWSCPFLDGVCVSQKVSV/CWQ+/ 
CPWGAJRAEGRLSAWDSQISCCKGDLCNAV 
VLAAGSPWALCV QLLLSLGS VFLW ALL 


471 


1821 


A 


3723 


891 


494 


LRQSL/NSVPQAGVQWRDSSLQAPPPRFTPLS 
CLSLPS S WDYRRLPPCLANFLYF* *RRGFTML 
ARMVLIS+PRDPPASASQ\STEITGGSHRAQHP 
TDSRDHSERSVKKSHEVISELRMKVIKCKVAF 
SKNPI 


472 


1822 


A 


3734 


443 


251 


GFIET*NFCVSKDTSKKLS/RLPTKWKNVFAN 
♦ISDKGLVSRICQELLRHLDAEQVSSTAGLSL 


473 


1823 


A 


3746 


3 


500 


THASGGARSGAGWAGRGVRAGTEAGRGGIF 

LTLSILRTRDLPSGAMSEGVDLIDIYADEEFNQ 

DPEFNNTDQIDLYDDVLTATSQPSDDRSSSTE 

PPPPVRQEPSPKPNNKTPAILYTYSGLRNRRA 

AVYVGSFSWWTTDQQLIQV1RSIGVYDVGEV 

KFAENRAK 


474 


1824 


A 


3753 


2 


5262 


RPLFAREGGIYAVLVCMQEYKTSV\LVQQAG 

LAALKMLAVASSSEIPTFVTGRDSIHSLFDAQ 

MTREIFASIDSATRPGSESLLLTVPAAVILMLN 

TEGCSSAARNGLLLLNLLLCNHHTLGDQirTQ 

ELRDILFRHSGIAPRTEPMITmTILMMLLNR 

YSEPPGSP\ERAALETPIIQGQDGSPELLIRSLV 

GGPSAELLLDLERVLCREGSPGGAVRPLLKRL 

QQETQPFLLLLRTLDAPGPNKTLLLSVLRVIT 

RLLDFPEAMVLPWHBVLEPCLNCLSGPSSDSE 

IVQELTCFLHRLASMHKDYAWLCCLGAKEI 

LSKVLDKHSAQLLLGCELRDLVTECEKYAQL 

YSNLTSSILAGCIQMVLGQIEDHRRTHQPINIP 

FFDVFLRHLCQGSSVEVKEDKCWEKVEVSSN 

PHRASKLTDHNPKTYWESNGSTGSHYITLHM 

HRGVLVRQLTLLVASEDSSYMPARVWFGG 

DSTSCIGTELNTVNVMPSASRVILLENLNRFW 

PHQIRIKRCQQGGIDTRVRGVEVLGPKPTFWP 

LFREQLCRRTCLFYTIR^QAWSRDIAEDHRRL 

LQLCPRLNRVLRHEQNFADRFLPDDEAAQAL 

GKTCWEALVSPLVQNTTSPDAEGVSALGWLL 

DQYLEQRETSRNPLSRAASFASRVRRLCHLL 

VHVEPPPGPSPEPSTRPFSKNSKGRDRSPAPSP 

VLPSSSLRNITQCWLSWQEQVSRFLAAAWR 

APDFVPRYCKLYEHLQRAGSELFGPRAAFML 
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ALRSGFSGALLQQSFLTAAflMSfeQFARYIDQ 

QIQGGLIGGAPGVEMLGQLQRHLEPIMVLSG 

LELATTFEHFYQHYMADRLLSFGSSWLEGAV 

LEQIGLCFPNRLPQLMLQSLSTSEELQRQFHLF 

QLQRLDKLFLEQEDEEEKRL*EEEEEEEEEEA 

EKELFIEDPSPAISILVLSPRCWPVSPLCYLYHP 

RKCLPTEFCDALDRFSSFYSQSQNHPVLDMG 

PHRW.QWTWLGRAELQFGKQILHVSTVQMW 

LLLKFNQTEEVSVETLLKDSDLSPELLLQALV 

PLTSGNGPLTLHEGQDFPHGGVLRLHEPOPQ 

RSGEALWL1PPQAYLNVEKDEGRTLEQKRNL 

LSCLLVRILKAHGEKGLfflDQLVCLVLEAWQ 

KGPNPPGTLGHTVAGGVACTSTD VL SCDLHLL 

GQGYVKRRDDRPQILMYAAPEPMGPCRGQA 

DVPFCGSQSETSKPSPEAVATLASLQLPAGRT 

MSPQEVEGLMKQTVKQVQETLNLEPDVAQH 

LLAHSHWGAEQLLQSYSEDPEPLLLAAGLCV 

HQAQAVPVRPDHCPVCVSPJLGCDDDLPSLCC 

MHYCCKSCWNEYLTTRIEQNLVLNCTCPIAD 

CPAQPTCAFIRAIVSSPEVISKYEKALLRGYVE 

SCSNLTWCTNPQGCDRILCRQGLGCGTTCSK 

CGWASCFNCSFPEAHYPASCGHMSQWVDDG 

GYYDGMSVEAQSKHLAKLISKRGPSCQAPIE 

KNEGCLEIMTGAKCRHGFCWRCLKSWKPNH 

KDYYNCSAMVSKAARQEKRFQDYNERCTFH 

HQAREFAVNLRNRVSAIHEVPPPRSFTFLNDA 

CQGLEQARKVLAYACVYSFYSQDAEYMDW 

EQQTENLELHTNALQILLEETLLRCRDLASSL 

RLLRADCLSTGMELLRRIQERLLAELQHSAQD 

FRVGLQSPSVEAWEAKGFNMPGSQPQASSGP 

EAEEEEEDDEDDVPEWQQDEFDEELDNDSFS 

YDESENLDQE I'FFFGDEEEDEDEAYD 


475 


1825 


A 


3754 


1093 


96 


GTSRNQHSPKTHA*RSS/WPQPPPLFLPPLQPQ 

ATGRRRRRTRTQQRTAALLTDGTTKTGAAW 

SRRPSLCWPSRTTGAPGAK+ AVL VRS ATPTTN 

PFNPQSPTGAAGKLRAPGNRAG/SEPSSQEPPP 

DGTR\RPASITGVAQSPATRATPSLPCLHVPAP 

SRGQTLGVR1TGRASRLTVDRSRLSWPGRSA 

RSGGGRWRPNAFRGRWPRAP* SWEPGS WTE 

PWRWFFPAAESPPHRCIYCTNHVSPAGPARPS 

HVYHRATINSISHPLCRAQSSPWEAAGVWRR 

PAQPAPTSDVNINLLRKPRVKRHDLIYQFLGN 

TLWEEGRQRPPETLQPAR 


476 


1826 


A 


3758 


901 


521 


FFFGNGVSPCPQAGV*WHDLDSLQNLPPGFK 
RFSYLSLPSSW\DYRHVPPRQANFCIF/M+RRG 
FTMLARMVSIS*PRDLPALASQSAGITGVSHH 
APPQMDFTFALLCFAPKGCLPRQKEGGTLNLI 


477 


1827 


A 


3761 


843 


575 


GVISAHCKLRL/CHLPGSSNSPASASQVAGTIG 

ARTTPS*IFYFLVETGFHHVSQDGLDLL/NFV1 

RPRRPLKVLGLQACTRARU^PLKEL 


478 


1828 


A 


3763 


267 


1240 


HLLSFHLWSASLDCLEQLSQERHVKGMLLGP 

PPVNESTKPSPSPWKLTPPMCSIPPVFPPKSGS 

PTTSWS/PSGHSKLEVERAQTGPFCLHIYCP*P 

GVTDNTTSLLHYIPFPRL\SGLVCFPAH*FPSY 

WTOHSFASQAWLRQVPEVSKHLQCPSAESLL 

TMEYHQPEDPAPGKAGTAEAVIPENHEVLAG 

PDEHPQDTDARDADGEAREREP/RRPSFAA*P 

V WGQPVESFLPEAS SAPPGPTLGTLPEVETiRA 

CSMPQELP*SPRTRQPEPDFYCVKWIPWKGE 

QTPIITQSTNGPLPSPCHHEHPLSSVEGEAPPA 
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cuauruu _j 


479 


1829 


A 


3766 


2 


2152 


YSPLRJLLEVCVPLPKIF1KRQAPLKVSLLQDLK 

DFFQKVSQVYVAIDERLASLKTDTFSKTREEK 

MEDIFAQKEMEEOEFKNWIEKMQARLMSSS 

VDTPQQLQSVFESUAKKQSLCEVLQAWNNR 

LQDLFQQEKGRKRPSVPPSPGRLRQGEESKIS 

AMDASPKNISPGLQNGEKEDRFLTTLSSQSST 

SSTHLQLPTPPEVMSEQSVGGPPELDTASSSE 

DVFDGHLLGSTDSQVKEKSTMKAIFANLLPG 

NSYNPIPFPFDPDKHYLMYEHERVPIAVCEKE 

PSSIIAFALSCKEYRNALEELSKATQWNSAEE 

GLPTNSTSDSRPKSSSPIRJLPEMSGGQTNRTTE 

TEPOPTKKASGMLSFFRGTAGKSPDLSSOKRE 

TLRGADSAYYQVGQTGKEGTENQGVEPQDE 

VDGGDTQKKQLINPHVELQFSDANAKFYCRL 

YYAfiEI^KMRPVILDSSEEDFnvSLSHSSPWO 

ARGGKSGAAFYATEDDRFILKQMPRUEVQSF 

LDFAPH YFN Y 1TN A VQQKRPTALAKILGVYRI 

GYKNSQNNTEKKLDLLVMENLFYGRKMAQ 

VFDLKGSLRNRNVKTDTGKESCDVVLLDENL 

LKMVRDNPLYIRSHSBCAVLRTSIHSDSHFLSS 

HLIIDYSLLYGRDDTSNELWGIIDY1RTFTWD 

KKLEMVVKSTGILGGQG*MPTVVSPELYRTR 

FCEAMDNYFLMVPDHCTGLGLNC 


480 


1830 


A 


3777 


251 


3 


QGCGSAGTLIHY**ECKMVQLLWKTV* QFLI 

KLNI\KDPATTLDVYPNE\ r KNYVRTKTYTQMF 

I/ANFMAKSWKQPTHPSVRT 


481 


1831 


A 


3779 


333 


3 


EAAIRQPEPNILDVNQIFKDLAMIIEIDQGDLID 
SEBAN AES SE VL VERAPG QLQRPAVYY QKK S R 
KmCtVVLVQTAIILICERIM+VVYTrKWSPPI 
VLPVSCFQGQKFN 


482 


1832 


A 


3780 


2 


371 


TGGRQGKNDHTSITEKPSRDFNRHL1TQNI*M 
PNQDMKSSSNSLinvKVQIKPTILYHHIFTRKA 
KMKTTDKTKYR* GFKAITTLIHCSQDCI£LQ * S 
IL* ENHFM1FPKAEQHIT YDTTIPFLR 


483 


1833 


A 


3787 


43 


44* 


LMKDLSP^'VMETHYILNRIJ^R/RSMWRHIIG 
KLPNTKDQEKHKAIRGRREYIQGS/RQQYRR 
PAAFSAAEKARRLWCS/VFNIERRNL/CEYPTK 
L SFNIKGEIvn'FSDKTEFTTNRPSLKMLLKDRI 
QEEGKMF*K£KCFKRKE 


484 


1834 


A 


3798 


1 


727 


FFFFETESRSVAQAGVQWCNLGSLQALPPGR 

SHSPASASRVAGTTGTRH*ARLIFYIFSRDGVS 

PC*PGWS*SPDLVIRPP\RLPKCWDYRREPPRP 

A*FmLVE\QGrTM^V[VSIS*PQ/CDLPAS 

VSQNAGITGVSHCAWPCLHFCFFGFFFEMESC 

SVAQAEVQWHDLRSLQAPPPGFTPFSCLSLPG 

SWDYRRPPPRPANIACIFSRDGVSPC*PGWSRS 

PDLVIRPPRPPKVLGLQA 


485 


1835 


A 


3802 


1 


239 


FFFFEMECLTVSQAGVQWYNLHSLQPLPPGF 
KQFSC\LSLPSSWD*RVPTSRPAKF/CVIF*DGV 
SHCQPGWSAWQPPLH 


486 


1836 


A 


3811 


378 


98 


RYD* SSQSENIP\QKEFLLKYP*CTATLGMRN 

MSIMKKKSIFSAEFYKVSLPSLLL\HLLAIEWG 

FHIEIQLTIHQHFLNYELESDFVHIVEYM 


487 


1837 


A 


3814 


771 


320 


FDPD WTKAAG1RHEKKPKALAYRRENSPGDL 

PPPPLPPPEEEASWAL/GAEGSRQHVLPGAGA 

QWGEESGPGRAPGSPAGAPPR*RGLAP\NSRP 

SFLSRGQGTSTCSTAGSNSSRGSSSSRGSRGPG 

RSRSRSQSRSQSQRPGQKRREEPR 
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488 


1838 


A 


3818 


1 


781 


FRACLLELIPYAPTLSWTACPPAMAGPRGLLP 

LCLIAFCLAGFSFVRGQVLi^GCDVKTTFVT 

HVPCTSCAAIKKQTCPSGWLRELPDQITQDCR 

YEVQLGGSMVSMSGCRRKCRKQWQKACCP 

GYWGSRCHECPGGAETPCNGHGTCLDGMDR 

NGTCVCQENFRGSACQECQDPNRFGPDCQSV 

CSCVHGVCNHGPRGDGSCLCFAGYTGPHCD 

QELPVWQELGFPQNNPRLRKAPNCKCLPG*H 

RNGLIATPNPCRP 


489 


1839 


A 


3822 


934 


669 


FFFSEMESRSVTRLECSGAISAHLRLLGSSNSP 
ASAS * VAGTIGACHHAQLIFVFLVETGFHHVG 
QDGLDLUNLMIHPPRPPKVLGFQA 


490 


1840 


A 


3825 


79 


9748 


GCQSCWPAWPRLRRRGPASAGARLGRKAPW 

GLPGRVQDGRPLRFCFYLRPRAPFIAPVLSGA 

ASRPEASGDCRAGRETAMATLEKXMKAFESL 

KSFQQQQQQQQQQQQQQQQQQQQQQQPPPP 

PPPPPPPQLPQPPPQAQPLLPQPQPPPPPPPPPP 

GPAVAEEPLHRPKKELSATKKDRVNHCLTIC 

ENTVAQSVRNSPEFQKLLGIAMELFLLCSDDA 

ESDVRMVADECLNKVrKALMDSNLPRLQLEL 

YKEIKKNGAPRSLRAALWRFAELAHLVRPQK 

CRPYLVNLLPCLTRTSKRPEESVQETLAAAVP 

KJMASFGNFANDNEIKVLLKAFIANLKSSSPTT 

RRTAAGSAVSICQHSRRTQYFYSWLLNVLLG 

LLVPVEDEHSTLL1LGVLLTLRYLVPLLQQQV 

KDTSLKGSFGVTRKEMEVSPSAEQLVQVYEL 

TLHHTQHQDHNWTGALELLQQLFRTPPPEL 

LQTLTAVGGIGQLTAAKEESGGRSRSGSLVELI 

AGGGSSCSPVLSRKQKGKVLLGEEEALEDDS 

ESRSDVSSSALTASVKDEISGELAASSGVSTPG 

SAGHDIITEQPRSQHTLQADSVDLASCDLTSS 

ATDGDEEDILSHSS SQVSA VPSDPAMDLNDG 

TQASSPISDSSQTTTEGPDSAVTPSDSSEIVLD 

GTDNQYLGLQIGQPQDEDEEATGILPDEASEA 

rTWSSMALQQAHLLKNMSHCRQPSDSSVDKF 

VLRDEATEPGDQENKPCRIKGDIGQSTDDDS 

APLVHCVRLLSASFLLTGGKNVLVPDRDVRV 

SVKALALSCVGAAVALHPESFFSKLYKVPLD 

TTEYPEEQYVSDILNYIDHODPQVRGATAILC 

GTLICSILSRSRFHVGDWMGTIRTLTGNTFSL 

ADCPLLRKTLKDESSVTCKLACTAVimCVM 

SLCSSSYSELGLQLIIDVLTLRNSSYWLVRTEL 

LETLAEIDFRLVSFLEAKAENLHRGAHHYTGL 

LKLQERVLNNWIHLLGDEDPRVRIIVAAASL 

IRLVPKLFYKCDQGQADPWAVARDQSSVYL 

KLLMHETQPPSHFSVSTITRIYRGYNLLPSITD 

VTMENNLSRVIAAVSHELITSTTRALTFGCCE 

ALCLLSTAFPVCIWSLGWHCGVPPLSASDESR 

KSCTV GMATMILTLLSS AWFPLDLSAHQDAL 

ELAGNIXAASAPKSLRSSWASEEEANPAATK 

QEEVWPALGDRALVPMVEQLFSHLLKVINIC 

AHVLDDVAPGPAIKAALPSLTNPPSLSPIRRK 

GKEKEPGEQASVPLSPKKGSEASAASRQSDTS 

GPVTTSKSSSLGSFYHLPSYLKLHDVLKATHA 

NYKVTLDLQNSTEKFGGFLRSALDVLSQILEL 

ATLQDIGKCVEEILGYLKSCFSREPMMATVC 

VQQLLKTLFGTNLASQFDGLSSNPSKSQGRA 

QRLGSS S VRPGLYHYCFMAP YTHFTQAL ADA 

SLRNMVQAEQENDTSGWFDVLQKVSTTQLKT 

NLTSVTKKRADK^JAIHNHIRLFEPLVIKALKQ 
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DSDQWIGFVLKQFEYIEVGQFRESEAIIPNIFF 

FLVLLSYERYHSKQUGIPKIIQLCDG1MASGR 

KAVTHAIPALQPIVHDLFVLRGTNKADAGKE 

LETQKEWVSMLLRLIQYHQVLEMFILVLQQ 

CHKENEDKWKRLSRQ1ADIILPMLAKQQMH1 

DSHEALGVLNTLFEILAPSSLRPVDMLLRSMF 

VTPNTMASVSTVQLWISGILAILRVUSQSTED 

IVLSRIQELSFSPYLISCTVINRLRDGDSTSTLE 

EHSEGKQDCNLPEETFSRFLLQLVGILLEDIVT 

KQ1XVEMSEQQHTFYCQELGTLLMCUHIFKS 

GMFRR1TAAATRLFRSDGCGGSFYTLDSLNLR 

ARSMHTHPALVLLWCQILLLVNHTDYRWW 

AEV QQTPKRHSLSSTKLLSPQMSGEEEDSDL A 

AKLGMCNREIVRRGALILFCDYVCQNLHDSE 

HLTWLIVNHIQDLISLSHEPPVQDFISAVHRNS 

MSGIJFIQAIQSRCENLSTTTMLKKTLQCLEGI 

HLSQSGAVJLTLY VDRLLC1PFR VL ARM VDJL 

ACRRVEMLLAANLQSSMAQLPMEELNR1QEY 

LQSSGLAQRHQRLYSLLDRFRLSTMQDSLSPS 

PPVSSHPLDGDGHVSLETVSPDKDWYVHLVK 

SQCWTRSDSALLEGAELVNRIPAEDMNAFM 

MNSEFNLSLLAPCLSLGMSE1SGGQKSALFEA 

AREVTLARVSGTVQQLPAVHHVFQPELPAEP 

AAYWSKLNDLFGDAALYQSLPTLARALAQY 

LVWSKLPSHLHLPPEKEKD1VKFVVATLEAL 

SWHLIHEQIPLSLDLQAGLDCCCLALQLPGL 

WSWSSTIWIHACSLIYCVHFILEAVAVQPG 

EQLLSPERRTNTPKAISEEEEEVDPNTQNPKYI 

TMCEMVAEMVESLQSVLALGHKRNSGVPA 

FLTPLLRNniSLARLPLVNSYTRVPPLVWKLG 

WSPKPGGDFGTAFPEIPVEFLQEKEVFKEFIYR 

INTLGWTSRTQFEETWAIULGVLVTQFLVME 

QEESPPEEDTERTQINVLAVQAITSLVLSAMT 

VPVAGNPAVSCLEQQPRNKPLKALDTRFGRK 

LSHRGIVEQBIQAMVSKREN1ATHHLYQAWD 

PVPSLSPATTGALISHEKLLLQINPERELGSMS 

YKLGQVSIHSVWLGNS1TPLREEEWDEEEEEE 

ADAPAPSSPPTSrVNSRKHlUGVDIHSCSQFL 

LELYSRWILPSSSARRTPAHISEVVRSLLVVS 

DLin-ERNQFELMYVTLTELRRVHPSEDEILAQ 

YLVPATCKAAAVLGMDKAVAEPVSRLLESTL 

RSSHLPSRVGALHGVL YVLECDI 1 .DDTAKOI . 

IPVISDYLLSNLKGIAHCVNIHSQQHVLVMCA 

TAFYLIENYPLDVGPEFSAS1IQMCGVMLSGS 

EESTPSIIYHCALRGLERLLLSEQLSRLDAESL 

VKLSVDRVNVHSPHRAMAALGLMLTCMYT 

GKEKVSPGRTSDPNPAAPDSESVIVAMERVS 

VLFDRIRKGFPCEARWARILPQFLDDFFPPQ 

DIMNKVIGEFLSNQQPYPQFMATVVYKVFQT 

LHSTG QSSM VRD WVMLSLSNFTQRAP VAMA 

TWSLSCFFVSASTSPWVAAELPHVISRMGKLE 

QVDVNLFCLVATDFYRHQIEEELDRRAFQSV 

LEVVAAPGSPYHRLLTCLRNVHKVTTC 


491 


1841 


A 


3826 


469 


302 


SNPPASASRVAGITGVHQHAWLIFVFLVEMEF 
HHVGQAVLKLLISGDLPVSASQSA 


492 


1842


A 


3836 


392 


88 


VAPSPMIMPDLYFYRDPEE1EKEE+AAAEKAEE 
FQSEWTAW/PyEFTATQSEVADWFKDMQVP 
SVPIQQFPTEDWST*PTMNDWSATSTAQTTE 
WVRITTEWP 
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493 


1843 


A 


3838 


19 


380 


TPSDMNRAFETDTQSIGEKNRSPSEPDYFERK 
KFKRS* EKAHIR YKIDQPEDIPLK\EFLCKHSK 
CTATLSMRNMSLMKKXCSFSEEF\LAFFPS£X 
VCHIXAIKXGFYIE1HLTTFNNTF 


494 


1844 


A 


3845 


2 


352 


FFFLRRSL/D S V AQ AEAQ WL\ELGLLQ APPP GF 
KPISLP\GLPSSWDYGRPPPCPANFCIF/M*RRG 
FTVLARMVLIS*PCDPPTLASQGTArrGMSYH 
ARPQDIDFLYAHQGRCWFRLL 


495 


1845 


A 


3847 


1774 


40 


DlFFRftAKEG^GODEAQFSVEMPLTGKA^L 

WADKYRPRKPRFFNRVHTGFEWNKYNQTHY 

DFDNPPPKIVQGYKFNIFYPDLIDKRSTPEYFL 

EACADNKDFAILRFHAGFPYEDIAFKTVNREW 

EYSHRHGFRCQFANGIFQLWFHFKRYRYRR* 

RP WGTAGRCPRGHSKGAS VKLVVTPGPL SGL 

QGRGFTSHLRPHLSFARPQFPPI*KGGHH*AC 

HGELRRHWDRLA* GPDATEGALGASFEHEG 

GQQPPADLTVQADTLHRPSARLGGAHRACPK 

RRPHRVLWRWARGAWAWRCQAREKQETQG 

QPCHTTGHPLGREAEPAAAGAAPALAHRPPF 

ARTGSTEVPGPCWRPIRHCRRDPLWTPTLORD 

WPFrHPVLAG G VHFP AAG/IGGC VEVP VS VN 

VMGTKSH*AVLPPPPSTGPGGQGLPEGWGLE 

KGEGLPPGIPPPGLLTGPWNSMRPVTPSFAHIR 

TVAPSHSPFSGQEGRGPHGCHSPGR\SGP\AGR 

LVLQHPTGTSPTEAKRKVPPGPPEGHPTSPVT
SPRPPTAPPRHPASSGNSSVCFSKKTCRWEKK:
SFVLMELAYWQDRMFF


496 


1846 


A 


3849 


830 


442 


AKSPLPLG*IQWR/NLGSLKLRLPGFK*FTCLG 
LLSSWDYRSLPPRPVNFCILVELGFHHVDQAG 
LKLLTSSALPALASQSAEITGMSHRIWPLPLLR 
RPPVIRIRAPPQRLPF^ITSLKjaSPNMATF 


497 


1847 


A 


3859 


2 


393 


ALRKTRRDGIARTGAQPAASWKGTNNYPWR 

LEMAGRPGSQEQSKDRGTGSLPPPSQRPLGPS 

PEGAGPSPPPPGIPRGGGS3SSEGP/PQLLFVPR 

RFPAPKKGLPSDTPHSKAPPTPHLILGGEDSQ 

VPIL 


498 


1848 


A 


3860 


253 


634 


KNASTVYSSQGDPKSFFFLLRWSLALVAQAG 
EQ*RDLSSLQPPPPGFK*FSCLSLPSSWD\YRCP 
LPC3-ANF\*FLVETGFHHVGQADLiaLTSGDP 
PTSASESAGITGVSHRAWPRIHFLYWKTFFL 


499 


1849 


A 


3863 


423 


263 


APSQISVAFLYAA/DKLFEKEI*KKIPFIIAS/DKI 

KIGINLTKEVKYLYTENYTTLMKEIKyDTDKW 

KDILY*WIGKINI*KMSTPPKAIYRFNAIPTKIP 

MTFFrennCSIIICFIWNI IKKPPNTQ SMEQKE* S 

FCSILLWVFGGFLWFHMNFMIDFSISVKNVIGI 

LVGIALNL 


500 


1850 


A 


3865 


2 


15246 


LPRGCLWCLQRSPTPARPQPSRPARSPLPLFP 
DLRPWASDLDIMGDAEGEDEVQFLRTDDEV 
VLQCSATVLKEQLKLCLAAEGFGNRLCFLEP 
TSNAQNVPPDLAICCFVLEQSLSVRALQEML 
ANTVEAGVES SQGGGHRTLL YGHAILLRHAH 
SRMYLS CLTTSR5MTOKLAFDVGLQEDATGE 
ACWWTMHPASKQRSEGEKVRVGDDIILVSVS 
SERYLHLSTASGEIXJVDASFMQTLWNMNPIC 
SRCEEGFVTGGHVLRLF1 IGHMDECLTISPADS 
DDQRRL YYYEG G AVCTHARS L WRLEPLRIS 
WSGSHLRWGQPLRVRHVTTGQYLALTEDQG 
LVWDASKAHTKATSFCFR1SKEKLDVAPKR 
D VEGMGPPEIKYGESLCFV QHVASGLWLTYA 
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A t>T%J>VLAlJ>l..rs\n.VXX-A\A1. TJfXC ntl\ATW\ A T CT 

TRCX5QEESQAARM1HSTNGLYNQFIKSLDSFS 

GKPRGSGPPAGTALP1EG VILSLQDUI YFEPPS 

EDLQHEEKQSKLRSLRNRQSLFQEEGMLSMV 

LNCIDRLNVYTTAAHFAEFAGEEAAESWKEI 

VNLLYELLASLIRONRSNCALFSTNLDWLVS 

KLDRLEASSGILEVLYCVL1ESPEVLN1IQENHI 

KSIISLLDKHGRNHKVLD VLCSLCV CNG VAV 

RSNQDLITENLLPGRELLLQTNLINYVTSIRPN 

1FVGRAEGTTQYSKWYFEVMVDEVTPFLTAQ 

ATHLRVGWALTEGYTPYPGAGEGWGGNGV 

GDDLYSYGFDGLHLWTGHVARPVTSPGQHL 

LAPEDVISCCLDLSVPSISFRINGCPYQGVFFSF 

NLDGLFFPWSFSAGVKVRFLLGGRHGEFKF 

LPPPGYAPCHEAVLPRERLHLEPDCEYRREGP 

RGPHLVGPSRCLSHTDFVPCPVDTVQIVLPPH 

LERIREKLAENIHELWALTR1EQGWTYGPVRD 

DNKRLHPCLVDFHSLPEPERNYNLQMSGETL 

KTLLALGCHVGMADEKAEDNLKKTKLPKTY 

MMSNGYKPAPLDLSHVRLTPAQTTLVDRLAE 

NGHNVWARDRVGQGWSYSAVQDIPARKNPR 

LVPYRLLDEATKRSNRDSLCQAVRTLLGYGY 

NIEPPDQEPSQVENQSRCDRVRIFRAEKSYTV 

QSGRWYFEFEAVTTGEMRVGWARPELRPDV 

ELGADELAYVFNGHRGQRWHLGSEPFGRPW 

QPGDVVGCMIDLTENT[IFTLNGEVLMSDSGS 

ETAFREIE1GDGFLPVCSLGPGQVGHLNLGQD 

VSSLRFFAICGLQEGFEPFAINMQRPVTTWFS 

KGLPQFEPVPLEHPHYEVSRVDGTVDrPPCLR 

LTHRTWGSQNSLVEMLFLRLSLPVQFHQHFR 

CTAGATPLAPPGLQPPAEDEARAAEPDPDYE 

NLRRSAGGWSEAENGKEGTAKEGAPGGTPQ 

AGGEAQPARAENEKDATTEKNKKRGFLFKA 

KKVAMMTQPPATPTLPRLPHDVVPADNRDD 

PEIILNTmTYSVRVFAGQEPSCVWAGWVT 

PDYHQHDMSFDLSKVRWTVTMGDEQGNV
HSSLKCSNCYMVWGGDFVSPGQQGRISHTDL
VIGCL'VDLATGLMTFT^GKESNTFFQVEPN
TKLFPAVFVLPTHQNVIQFELGKQKNIMPLSA

AMFQSERKNPAPQCPPRLEMQMLMPVSWSR 

MPNHFLQVETRRAGERLGWAVQCQEPLTMM 

ALHIPEENRCMDILELSERLDLQRFHSHTLRL 

YRAVCALGNNRVAHALCSHVDQAQLLHALE 

DAHLPGPLRAGYYDLLISIHLESACRSRRSML 

SEYIVPLTPETRAITXFPPGRSTENGHPRHGLP 

GVGVTTSLRPPHHFSPPCFVAALPAAGAAEAP 

ARLSPAIPLEALRDKALRMLGEAVRDGGQHA 

RDPYGASVEFQFVPVLKLVSTLLVMGIFGDE 

DVKQILKMIEPEVFTEEEEEEDEEEEGEEEDEE 

EKEEDEEETAQEKEDEEKEEEEAAEGEKEEG 

LEEGLLQMKLPESVKLQMCFDLLEYFCDQELQ 

HRVESLAAFAERYVDKLQANQRSRYGLLKA 

FSMTAAETARRTREFRSPPQEQINMLLQFKDG 

TDEEDCPLPEEIRQDLLDFHQDLLAHCGIQLD 

GEEEEPEEETTLGSRLMSLLEKVRLVKKKEEK 

PEEERSAEESKPRSLQELVSHMWRWAQEDF 

VQSPELVRAMFSLLHRQYDGLGELLRALPRA 

YTISPSSVEDTMSIJLECLGQIRSIXIVQMGPQE 

ENLMIQSIGN1MNNKVFYQHPNLMRALGMHE 

TVMEVMVNVLGGGESKEIRFPKMVTSCCRFL 
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M=Methionine, N=Asparagine, P=Proline, 
Q=Glutamine, R=Arginine, S=Serine, 
T=Threonine, V=Valine, W=Tryptophan, 
Y=Tyrosine, X=Unknown, *=Stop codon, 
/=possible nucleotide deletion, \=possible 
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■ 








C YFCRISRQNQRSMFDHLSYLLEN SGIGLGM 

QGSTPLDVAAASVIDNNELALALQEQDLEKV 

VSYLAGCGLQSCPMLVAKGYPDIGWKPCGG 

ERYLDFLRFAVF VNGES VEENAN WVRLLIR 

KPECFGPALRGEGGSGLLAAIEEAIRISEDPAR 

DGPGIRRDRRREHFGEEPPEENRVHLGHAIMS 

FYAALIDLLGRCAPEMHUQAGKGEALRIRA1 

LRSLVPLEDLVGIISLPLQIPrLGKDGALVQPK 

MSASFVPDHKASMVLFLDRVYGIENQDFLLH 

VLDVGFLPDMRAAASLDTATFSTTEMALAV 

NRYLCLAVLPLITKCAPLFAGTEHRAIMVDS 

MLHT\nfRLSRGRSLTKAQRDVIEDCLMSLCR 

YIRPSMLQHLLRRLVFDVPILNEFAKMPLKLL 

TOHYERCWKYYCLPTGWANFGVTSEEELHL 

TRKLFWGIFDSLAHKKYDPELYRMAMPCLC 

AIAGALPPDYVDASYSSKAEKKATVDAEGNF 

DPRPVETLNVnPEKLDSFINKFAEYTHEKWAF 

DKIQNNWSYGENIDEELKTHPMLRPYKTFSE 

KDKEIYRWP1KESLKAM1AWEWTIEKAREGE 

EEKTEKKKTAKISQSAQTYDPREGYNPQPPDL 

SAVTLSRELQAMAEQLAENYHNTWGRKKKQ 

ELEAKGGGTHPLLVPYDTLTAKEKARDREKA 

QELLKFLQMNGYAVTRGLKDMELDSSSIEBCR 

FAFGFLQQLLRWMDISQEFTAHLEAWSSGRV 

EKSPHEQEEKFFAKILLPLINQYFTOHCLYFLS 

TPAKVLGSGGHASNKEKEMITSLFCKLAALV 

RHRVSLFGTDAPAWNCLHILARSLDARTVM 

KSGPEIVKAGLRSFFESASEDIEKMVENLRLG 

KVSQARTQVKGVGQNLTYTTVALLPVLTTLF 

QHIAQHQFGDDVIIJDDVQVSCYRTLCSIYSLG 

TTKNTYVEKLRPALGECLARLAAAMPVAFLE 

PQLNEYNACSVYTTKSPRERAILGLPNS VEEM 

CPDIPVLERLMAD1GGLAESGARYTEMPHV1E 

ITLPMLCSYLPRWWERGPEAPPSALPAGAPPP 

CTAVTSDHLNSLLGN1LR1IVNNLGIDEASWM 

KRLAVFAQPIVSRARPELLQSHFIPTIGRLRKR 

AGKWSEEEQLALEAKAEAQEGELLVRDEFS 

VIXRDLYALYPLLIRYVDNNRAQWLTEPNPS 

AEELFRMVGEIFIYWSKSHNPKREEQNFVVQ 

NEINNMSFLTADNKSKMAKAGDIQSGGSDQE 

RTKJKKRRGDRYSVQTSUVATLKKMLPIGLN 

MCAPTDQDLITLAKTRYAiKDTDEEVREFLH 

NNLHLQGKVEGSPSLRWQMALYRGVPGREE 

DADDPEKIVRRVQEV S AVLYYLDQTEHPYKS 

KKAVWHKLLSKQRRRAWACFRMTPLYNLP 

THRACNMFLESYKAAWtLTEDHSFEDRMIDD 

LSKAGEQEEEEEEVEEKKPDPLHQLVLHFSRT 

ALTEKSKLDEDYLYMAYADIMAKSCHLEEG 

GENGEAEEEVEVSFEEKQMEKQRLLYQQARL 

HTRG AAEMVLQMI S ACKGETGAMVSSTLKL 

GISILNGGNAEVQQKMLDYLKDKKEVGFFQS 

IQALMQTCSVLDLNAFERQNKAEGLGMVNE 

DGTVINRQNGEKVMADDEFTQDLFRFLQLLC 

EGHNNDFQNYLRTQTGNTITINIIICTVDYLL 

RLQESISDFYWYYSGKDVIEEQGKRNFSKAM 

SVAKQVFNSLTEYIQGPCTGNQQSLAHSRLW 

DAWGFLHVFAHMMMKLAQDSSQIELLKEL 

LDLQKDMWMLLSLLEGNWNGMIARQMV 

DMLVESSSNVEMILKFFDMFLKLKDIYGSEAF 

QDYVTDPRGLISKKDFQKAMDSQKQFSGPEI 
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Hsoleucine, K=Lysine, L=Leucine, 
M=Methionine, N=Asparagine, P=Proline, 
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QFIXSGSEABENBiyilNCI^ 

FNVA VLLTNL SEHVPHDPRLHNFLELAESILE 

YFRPYLGRIEIMGASRRIERIYFEISETNRAQW 

EMPQVKESKRQFJFDVVNEGGEAEKMELFVS 

FCEDTIFEMQIAAQISEPEGEPETDEDEGAGA 

AEAGAEGAEEGAAGLEGTAATAAAGATARV 

VAAAGRALRGLSYRSLRRRVRRLRRLTAREA 

ATAVAALLWAAVTRAGAAGAGAAAGALGL 

LWGSLFGGGLVEGAKKVTVTELLAGNfPDPT 

SDEVHGEQPAGPGGDADGEGASEGAGDAAE 

GAGDEEEAVHEAGPGGADGAVAVTDGGPFR 

PEGAGGLGDMGDTTPAEPPTPEGSPILKRKLG 

VDGVEEELPPEPEPEPEPELEPEKADAENGEK 

EEVPEPTPEPPKKQAPPSPPPKKEEAGGEFWG 

ELEVQRVKFLNYLSRNFYTLRFLALFLAFAIN 

FILLFYKVSDSPPGEDDMEGSAAGDVSGAGS 

GGSSGWGLGAGEEAEGDEDENMVYYFLEES 

TGYMEPALRCLSLLHTLVAFLC1IGYNCLKVP 

LVIFKREKELARKLEFDGLYITEQPEDDDVKG 

QWDRLVLNTPSFPSNYWDKFVKRKVLDKHG 

DIYGRERIAELLGMDLATLEITAHNERKPNPP 

PGLLTWLMSIDVKYQIWKFGVIFTDNSFLYLG 

WYMVMSLLGHYNNFFFAAHLLDIAMGVKTL 

RTILSSWHNGKQLVMTVGLLAVVVYLYTW 

AFNFFRKFYNKSEDEDEPDMKCDDMMTCYL 

FHMYVGVRAGGGIGDE1EDPAGDEYELYRVV 

FDITFFFFVIVILLAI1QGU1DAFGELRDQQEQV 

KEDMETKCFICGIGSDYFDTOHGFETHTLEE 

HNLANYMFFLMYLINKDETEHTGQESYVWK 

MYQERCWDFFPAGDCFRKQYEDQLS 


501 


1851 


A 


3869 


467 


665 


VTVAIY CQLIFDKGAKTEQ*PFQQIAL/CKRMK 
LGPCFTPCGKINSEW1RELSVRVKT1KHLEIGV 
N 


502 


1852 


A 


3888 


1042 


724 


SGMQWRDLTPLQPLPPRFKQFSCLSLPGSWD 
YRHAP\PLLTNF\*FLVEV1GFCYVGQAGRKLL 
ASSDQSALASQSAGITGISTAPGPPFFFLNFEA 
GSCSVAQAGVQ 


503 


1853 


A 


3891 


1773 


1193 


EVDSQSGVQ*QAPGSLQLQTPGLKA^SCLLSR " 

QDYRSSLPHLASCCYYYYYY/VFL+RRGLTTL 

VQGGLKLLPSSNPFASAP*TAGITGMSHCAGP 

HFNF*MFRKISCIRE*F* HTRIYDIPFLILFFKET 

WVLLCYPGWPQIPGLKPSSCLRLLSSWDHRC 

APPCTASFFIFHVDRVSPPCPGLVS1TFKMLLL 

L 


504 


1854 


B 


3896 


279 


70 


MVSKSKS1LMSYNHVELTFSDMKKMPEAFRR 

TQKHTIYLIPYQVIFWSTGKDAMRSFMMPFY 

QKEYYENQ* 


505 


1855 


A 


3899 


2 


1396 


EPGVPTKKTWFDKPDFNRTNSPGFQKKVQFG 

NENTKLELRKVPPELNNISKLNEHFSRFGTLV 

NLQVAYNGDPEGALIQFATYEEAKKAISSTEA 

VLNNRFIKVYWHREGSTQQLQTTSPKVMQPL 

VQQPILPWKQSVKERLGPVPSSTIEPAEAQS 

ASSDLPQVLST\LLA*QKQCIIQLL/WKAAQKT 

LLVSTSAVDNNEAQKKKQEALKLQQDVRKR 

KQEILEKHIETQKMLISKLEKNKTMKSEDKAE 

IMKTLEVLTKMTKLKDEVKAASPGRCLPKSI 

KTKTQMQKELLDTELDLYKKMQAGEEVTEL 

RRKYTELQLEAAKRGILSSGRGRGIHSRGRGA 

VHGRGRGRGRGRGVPGHAVVDHRPRALEIS 






WO 01/57188 



PCT/US01/03800 



SEQ ID NO: of nucleotide sequence 


SEQ ID NO: of peptide sequence 


Method 


SEQ ID NO: in USSN 09/496,914 


Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence 


Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence 


Amino acid sequence (A=Alanine, C=Cysteine, 
D=Aspartic Acid, E=Glutamic Acid, 
F=Phenylalanine, G=Glycine, H=Histidine, 
I=Isoleucine, K=Lysine, L=Leucine, 
M=Methionine, N=Asparagine, P=Proline, 
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AFTESDREDLLPHFAQYGEEtoCQlDDSSiHA 
VTTFKTRAEAEAAAVHGARFKGQDLKLAWN 
KPVTNISAVETEEVEPDEEEQREI1IA 




506 


1856 


A 


3911 


1952 


919 


DAELSGTLSLVLTQCCKRKDTVQKLASDHK 

DfflSSVSRVGKAIDKNFDSDISSVGIDGCWQA 

DSQRLLNEVMVEHFFRQGMLDVAEELCQES 

GLSVDPSQKEPFVELNRILEALKVRVLRPALE 

WAVSNREMLIAQNSSLEFKLIIRLYFISLLMG 

GTTNQREALQYAKNFQPFALNHQKD1QVLM 

GSLVYLRQGIENSPYVHLLDANQWADICDIFT 

RDACALLGLSVESPLSVSFSAGCVALPALINIIC 

AVIEQRQCTGVWNQKDELPIEV\DLG*KSAGY 

HSIFACPJLRQQTTDNNPPMKLVCGHirSRDAL 

NKMFNGSKLKCPYCPMEQSPGDAKQIFF 


507 


1857 


A 


3936 


439 


18 


SHPFSPAPG1CPDAPPPLPRPSKGLGHPGTAGA 

PGSGARCHPPSTCSPSWASPG*GAKASPALPR 

SHGVTLLCKAQAHLCRGEDSKDASGSTSQA 

WEPG*GAWGMPRCQGPALGSCFCPPGTTVQ 

RPAKQRDKRNRHLGR 


508 


1858 


A 


3944 


120 


412 


WCPAGTLDFPGPQEMVIXEIEVMNQLNHRNL 
IQLYAAIETPHEIVLFME\YECPK*W*GLGGGT 
TRHGASRpGVCAHSIEGGELFERIVDEDYHLT 
EV 


509 


1859 


A 


3949 


31 


392 


LTKTPSPREKGRGVLSVLLMMI* KCRVIFVKIP 
MVrTLQNFORffl>TV r A\WTGD*PNTL*KEQRG 
rTFSDSK^*YKATKIKTMWYCHICNRYID/ERN 
RIEDPEINPCICDKIIFRKLSMTTQ 


510 


1860 


A 


3954 


1013 


885 


FSETRACCPRLEHSORIR^HCSLMIPGSSDPFI' 
SASSVAATTG 


511 


1861 


A 


3956 


1 


1054 


PPAWAPRSPLIWAPTSGRHPCRAALPWSTSSV 

RWQPSEKQPPPPAHRGPADSLSTAAGAAELS 

AEGAGKSRGSGEQDWVNRPKTVRDTLLALH 

QHGHSGPFESKFKKEPALTAVARTARKRKPS 

PEPEGEVGPPK\TTERPSRGCPHPQRGSRSP*L 

LHPLLCli<HHPLPHL[PTGPHRLKRPRM\P\SP 

MAALILVADNAGGSHASKDANQVESTTRRN 

SNSPPSPSSMNQRRLGPREVGGQGAGNTGGL 

EPVHPASLPDSSlATSuAPLCCTLCHERLEDTli 

FVQCPSVPSHKFOPCSRQS1KQQGASGEVYC 

PSGEKCPLVGSNVPWAFMQGEIATILAGDVK 

VKKERDS 


512 


1862 


A 


3957 


1086 


3 


QDRARLDCSS ATS AHCNLRLPGS*D SPAS ASR 

VAGTTOTHHHTWLBLGSSVQTGFDHVGQAG 

LELLTSGDPPISASESAGIMGMSHCVWP* S WG 

LSHHMAPPQGDGGRARGTPGPEQSFWNLSC 

H*PRCQVPS*LMTQI^WGRHQYNPTMKRGK 

LRHREACSLPLPGEGEPGLQPSS\* SQNPCSSPL 

FHHGL*A\YLWCPELLLQGQARRH*RSPPS/FK 

CPATLSLTAWSQTBCRLRSQFLLLPWL*RAL*H 

PP^CHWPSRRSLGDPLLPRSQG*RDGT*ASTFC 

SYF*DTESHLVAQAGVQWRDLGSLQPPCPRL 

K\RFSRLSPPSSYTHRYVPSHLAESCISSRDRIP 

PSRPDRSRNSNSLSR 


513 


1863 


A 


3961 


3038 


476 


VALTTSMCCNKQVrVIDKIKSASIADRCGALH 

VGDHILSroGTSMEYCILAEATQFLANTiUQ 

VKLEILPHHQTRLALKGPDHVKIQRSDRQLT 

WDSWASNHSSLHTNHHYNTYHPDHCRVPAL 

TFPKAPPPNSPPALVSSSFSPTSMSAYSLSSLN 

MGTLPRSLYSTSPRGTMMRRRLKKKDFKSSL 
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i 

I 




SLAS STVOLAQQWHTETTE V VLTADPVTOF 

GIQLQGSWATETLSSPPLISYIEADSPAERCG 

VLQIGDRVMAINGIPTEDSTFEEASQLLRDSSI 

TSKVTLE1EFDVAESVIPSSGTFHVKLPKKHN 

VELGITISSPSSRKPGDPmSDIKKGSVAHRT 

GTLELGDKLLAIDNIRLDNCSMEDAVQILQQC 

EDLVKLKIRKDEDNSDEQESSGAIIYTVELKR 

YGGPLG\lTISGTCEP\FDI*nSSLTICGGLAERT 

GAIHIGDRILVAINSSSLKGKPLSEAIHLLQMAG 

ETVTLKIKKQTDAQSASSPKKFPISSHLSDLGD 

VEEDSSPAQKPGK1SDMYPSHGCPSVDSAVD 

SWDGSA\IDTS\YGTEGT\SFQASGY\NFNTYD 

WRSPKQRGS\LSPVT\KPRSQTYPDVGLSYED 

WDR5TAS GFAGAAID S AETEQ EENF WSQ ALE 

DLETCGQSGILRELEATIMSGSTMSLNHEAPT 

PRSPAGSDRPSFQERSSSRPHYSQTTRSNTLPS 

DVGRKSVTLRKMKQEDCE1MSPTPVELHKVT 

LYlfDSDMEDFOFSYADGLLEKGVYVKNIRPA 

GPGDLGGLKPYDRLLQVNHVRTRDFDCCLV 

VPLIAESGNKLDLVISRNPLASQKSBDQQSLPG 

U*bfcQNSAFFQQPSHGGNLETREPTNTL 


514 


1864 


A 


3967 


833 


800 


LEKQGVSGMATKRLARQLGLIRRKSIAPANG 

NLGRSKSKQLFDYLIVIDFESTCWNDGKHHH 

SQEnEFPAVLLNTSTGQIDSEFQAYVQPQEHPI 

LSEFCMELTGKQAQVDEGVPLKICLSQFCK 

WIHKIQQQKNIIFATG1SEPS/DF*SKIMCICYL 

VR*RISYTY*SKHKSKGC 


515 


1865 


A 


3969 


492 


182 


CRFWGISTHCDTCDPLSPQTTEG**EGDLWSLT 
DLLGPEFLARKPLFKTKTYQSTF*SISKNE/FTC 
PNFIIEEGTDLIF\* QVKHNPCHRLTPEEGT VQL 
NRADS 


516 


1866 


A 


3977 


2 


1357 


KMLC^QKESNYIRLKRAKMDKSMFVKKTLGI " 
GAFGEVCLARKVDTKALYATKTLRKKDVLL 
RNQ VA1 IVKAERDILAEADNE WVVRL YYSFQ 
DKDNLYFVMDYDPGGDMMSLLIRMGIFPESL 
ARFY1AELTCAVESVHKMGFIHRDIKPDNILID 
RDGHIKLTDFGLCTGFRWTHDSKYYQSGDHP 
RQDSMDFSNEWGDPSSCRCGDRLKPLERRAA 
RQHQRCLAHSLVGTPNYIAPEVLLRTGYTQL 
w w b v It VJLr bML V Gy PPFLAQTPLETQM 
KVINWQTSLHTPPQAKLSPEASDLUKLCRGPE 
DRLGKNGADEIKAHPIF*NQFDFSQ+PEDSRS 
AFKQFP*NHTTPTDTSNFDP\VDPDKLWSDDN 
ccjjj,\ yisu i iviNvjW i IviN UrUrtr iiriAr Y crTFRRF 
FDDNGYPYNYPKPIEYEYINSQGSEQQSDEDD 
QNTGSEIKNRDLVYV 


517 


1867 


A 


3980 


1358 


1022 


FFFKKFTQSLGFLLFSFSFLFSCFFFFHFVLFCY " 
VFLDRVPLCHPGWSAVVQSQVTA^NLPPSWD 
♦RCRPPH/I^LCNFCRDVSFTTLPRLVLNTWA 
QAIFQPQPPKVLGLQV 


518 


1868 


A 


3986 


974 


666 


SPEMESHPITQAGVQWHHLSSLQPLPPGPK*F " 
SCFSLPE*LGYRHVPPCLANSVFSVEMG\FLH 
VGQAGLEaTSGDLPALASQSAGITGVSHRAR 
PENGFENIF 


519 


1869 


A 


3994 


751 


126 


NQGLRHVGLCRTCLVNQiMFASSlLGKSHHHS 

LISINQGHNALWKAAG\PLPLKAGYC\QSFSPC 

DSLKYG\SWDEKDLTVPQEU)THBCRSVLRWIS 

QRGK\LAVEMEEGHCLL\LPLGTECLGIK\P1V 

HLFSSEMGE\NRPMVG\ARHVYSNAALLSFTP 



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LRCLGGEKHKSGLHARPVIVPSLELHYDMDS1 
AHWADLLLUTLPSYYIPFC 


520 


1870 


A 


3999 


882 


698 


QSFRLSLLSSWDYRHM*PRLANF*T\FFCRDR/ 
SL ALLPRL V SN S WPQ AILPPRPPK VLGLQT 


521 


1871 


A 


4011 


1346 


1178 


FFF*ETVSCSAS*AGVRSHDNSSLQPPSFU\t>SN 
PPTSASHVAGATGTHHHAWLLSV 


522 


1872 


A 


4015 


2 


377 


QGIAULTRMGESVKHVTGGYKLRTRPLEFAA 
IGD YLDTF ALKLGTIDRIAQRI IKEEIEYLVELR 
EYGPVYSTWSALEGELAEPLEGVSACIGNCST 
AL*ELTDDMTEDFLFVLREYIL YSD SMK 


523 


1873 


A 


4018 


341 


19 


ERVTHNQIQQAQRSPHIFNARRSS/PRPNIVtLP 
KVKEVOCTSKS/GQVIYKGVSIRLRANFLAEP 
L*NRREWDEAIKVLKEKQ\FLSKMVYPANLSF 
GNEGDITSFPAK 


524 


1874 


A 


4020 


1067 


743 


FFLRWSL/DSVAQAGVKWCNLGSLQAPFPUF 
TPFSCLSLPSSWDYRHPPPRLAN*LTNFLCF** 
RQGFTVLARMVLIS*PHDLPASASQSAGITGL 
SHCSWPTSSILS 


525 


1875 


A 


4021 


781 


351 


QFRVIFFFLRRSHSVAQAGMQWHDHSLLQPL 
PPRLKQ/F/SHLSPPSIWDYRRVPPCLVNFSIFF 
VETGSCQPCLQLLGSSNPPASASQSAGIAGISH 
QGQPE* SFDIRFAC VI AALRETFQCLCSASRVN 
NKIINRPTHPVESSF 


526 


1876 


A 


4024 


80 


341 


TPSSTSRGTEEQQSSKMAWQRREEKEHLNVR 
RSSAEDGWKADKP/VDG*TPGEDHLPTPSPFQ 
LHIHS SESQLHHS VKSPPSLSFRLM 


527 


1877 


A 


4026 


593 


230 


DFYLYPERKKRGQMMTAVSLTTRPQESVAFE 
DVAVYFTTKEWAIMG\PAERALYRDVMLEN 
YGGCGPL* CHPTSKPAL VFSULEQGKESCFSPA 
TGSSLSRNDWRAGWIGYLELRRYTYLS 


528 


1878 


A 


4028 


1160 


242 


" GTSELLCIQRWNWGPAFPPRPGLALAF1XQLL 
VEMGSAKSVPVTPARPPPHNKHLARVADPRS 
PSAGDLRTPIQVESSPQPGLPAGEQLEGLKHAQ 
DSDPRSFTLGIARTPMKTSSGDPPSPLVKQLSE 
VFETEDSKSNLPPEPVLPPEAPLSSELDLPLGT 
QLSVEEQMPPWNQTEFPSKQVFSKEEARQPT 
ETWASQSSDKPSRDPETPRSS\GSMRNRWKP\ 
NSSKVLVGKSPLHPSCQDDNSPGTLTLRQGKA 
AFKPLSENVSELKuiGAVILGTGRVLLKTEGRA 
WEQGQDXHDKENQHFPLVES 


529 


1879 


A 


4039 


2 


366 


KDMVLIMEMQSMITMKCPQYL*E*RKIPD1TK. 
CW*GCGSTGIUFC/WS*PL*KTI*QPR+FKQI*T 
ILTIIYSIM*EHTFHNAGV*LSDIYPRFMKGYV 
HTEICT*MFIAVLFVVVKTWKQF 


530 


1880 


A 


4057 


358 


3 


LLEVNG>nriVTVFTKAQNKKNKGSRSlLFKQL 
RKYGSRINLLKSKHDKMCTENYK'PMKEIEA 
/T)TDKWKDILCSWIRRIHMKDILCSWIGRTHV 
VKISILPKV>rrTlFVnLISIKnMAI 


531 


1881 


A 


4061 


50 


278 


TQGTEEIYKISSCEWVQASFSTPLITLHDFKiy 
HKATVlKMVWYWHRQ*KFSKN/RIESSEiEPH 
IYDQFIFDKGEKIIQEKGNSFFNN/MCWKNWIF 
T+KR 


532 


1882 


A 


4069 


19 


368 


NDLLENFKFWE i FKE*LENINGTVTEKETGGV 
YKELSSPKYSGTRQFYGQTISNFPGKIISMVY 
KliFQNTE/IEGRHPISLYEFRITLITIPNKDNrYL 
OIWMPVSLMNIVTLKCPT 


533 


1883 



A 


4076 


1 


355 


" PIRKJFTKVAG*KSNTPK*LAFLH1NNEQFENKJ/ 
ITNI/PFIIASKRIKYSGISLTKEMICDLYTETLLR 
KIKEDTNKWKDI/SCTWVGR/LNTVKMPKAaC 
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534 


1884 


A 


4088 


3 


1931 


IIDSSTRRMESERSPLYRQLIDLGYLSSSHWNC 

GAPGQDTKAQSMLVEQSEKLRHLSTFSHQVL 

QTRLVDAAKALNLVHCHCLDIFINQAFDMQR 

DLQriTKRLEYTRKKENELYESLMNIANRKQE 

EMKJ^MIVETLNTMKEELLDDATNMEFKDVI 

VPENGEPVGTREDCCCIRQIQELnSRLNQAVA 

NiaiSSVDYLRESFVGILERCLQSLEKSQDVS 

VHITSN YLKQILNAA YH VEVTFHSGS S VTRM 

LWEQDCQ1IQRITWVSPPAITLEWKRKVAQEAI 

ESLSASKLAKSICSQFRTRLNSSHEAFAASLRQ 

LEAGHSGRLEKTEDLWLRVRKDHAPRLARJLS 

LESRSLQDVLLHRKPKXGQELGRGQYGWYL 

CDNWGGHFPCALKSWPPDEKHWNDLALEF 

HYMRSLPKIIERLVDLHGSVIDYNYGGGSS1A 

VLLIMERLHRDLYTGLKAGLTLETRLQIALDV 

VEGIRFLHSQGLVHRDIKLKNVLLDKQNRAKI 

TDLGFCKPEAMMSGSIVGTPIHMAPELFTGK 

YDNSVDVYAFGILFWYICSGSVKLPEAFERCA 

SKDHLWNNVRRGARPERLPVFDEECWQLME 

ACWDGDPLKRPLLGrVQPMLQGIMNRJLCKS\ 

NSEQPNRGLDDST 


535 


1885 


A 


4090 


2 


417 


ALMPHEANYEEIFLKTDKDMDGFESGLEVRE 
IFLKTR/GLPSTLLAfflWALCDSKDCGKLSKD 
HFALAFHLmQKLlKGIDPPLVLTPEKlSPSNR 
ASLQKVTELTRKPVCHFKGTILWRiTDSIWMK 
HNRKRIWLRA 


536 


1886 


A 


4102 


569 


829 


DHQK*KNIPCSWIGRINIVKMSILPKAIYRFSAI 
P1K1PMTFFTEI*S*NVYRTTKTQE*AKAILSKK 
EQNLEESHYLDFK* YYRAV 


537 


1887 


A 


4104 


54 


281 


S1DCEHLIRRMLVLDPSKRLT1AQIKEHKWML 
IEVPVQRPVLYPQEQENEPSIGEFNEQVLRLM 
HSLGIDQQKTIE 


538 


1888 


A 


4109 


141 


314 


1RHIPLKIRSVVSHLKCFYKFILTFFFAGCSQPL 
VPRENITAWMNAIGLHTALPVS 


539 


1889 


A 


4111 


268 


1 


ASRPWGHSYP*FNQQEVDTLKRPIASSEI*MM 
I*KFAT\KKSPGPYRFTAEFSHTFKEDLVPILW 
PLFPKIYREGTLPHSFYEASITL 


540 


1890 


A 


4142 


198 


2064 


PliPGAGRAATPWGPLtWRGRGSGRCEkAAE 

AALGDFLGLHRRTQQPAVDRLLSDASAQWR 

VRGHGGVRESGRAPQQPGRRRGRRPRKRPR 

GRWRREGCGAGGRGVCVAAWSQRSIAGNN 

DYRLFHKMSNSHPLRPFTAVGEIDHVHILSEH 

1GALL10EE YGDV1F V VEKKRFPAHRVILAAR 

CQYFRALLYGGMRESQPEAEIPLQDTTABAFT 

MLLKYIYTGRATLTDEKEEVLLDFLSLAHKY 

GFPELEDSTSEYLCTILNIQNVCMTFDVASLY 

SLPKLTCMCCMFMDRNAQEVLSSEGFLSLSK

TAIXNIVLRDSFAAPEKI)IFLALLNWCKHNSIC 

ENHAE1MQA VRLPLMSL'iELiN V VRPSGJXSP 

DAILDAIKVRSESRDMDLNYRGMLIPEENJAT 

MKYGAQ^GELKSALLDGDTQNYDLDHG 

FSRHPIDDDCRSGIEIKlXjQPSIINHVRILLWDR 

DSRSYSYFIEVSMDELDWVRVIDHSQYLCRS 

WQKLYFPARVCmmiVGTHNTVNKlFHIVAF 

ECMFT>IKTFTLEKGLIVPMENVATIADCAS VI 

EGVSRSRNALLNGDTKNYDWDSGYTCHQLG 

SGAIWQLAQPYMIGSIRVLLWDCDDRSY 


541 


1891 


A 


4146 


282 


778 


GTLGYPNGARGQPQDNFFAHQWSHHPPISAC 






WO 01/57188 



PCT/US01/03800 



SEQ ID NO: of nucleotide sequence


SEQ ID NO: of peptide sequence


Method


SEQ ID NO: in USSN 09/496,914


Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence


Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence


Amino acid sequence (A=Alanine, C=Cysteine,


J^SEMFAFWQDMKWKNKFWGKSLEIVPVG
TVNYSLPRFGDHFEWNKVTSOHNVLSGQRW 
IEHYGEVLIKNTQDSSCHCKITFCKAKYWSSN 
VHEVQGAVLSRSGRVLHRLFGKWHEGLYRG 
PTPGGQCIWKP 


542 


1892 


A 


4147 


44 


433 


SVDAYVCND1VFS Y KT 1 J' LLLEGA*LTHRY V A 
QDPKQGQLRSLHLTCDSAPAGSQGTWSTSCR 
INHLIFRGGAQITFLATFDDSPKAVLGDRLLLT 
ANVSSENNTPRTSKTTFQLELSVKDAVYTW 

SSH 


543 


1893 


A 


4153 


678 


11 


TISYPQCLTQMYFLISFANVDTFLLPIMALDH 

YVAICSALQ*CSI1TP/ELCQGLPVLA*AGSSUS 

PVIlTVIMSIU^CSSAQISIinfRDAYLLMKIA 

CSHT*\NQHVFLGAWLFLAPCALILVSYIR1A 

AAILR1PSFTRRRKACSICSSHLSLVTLFYGTV 

LG1CI * PPDSFSAQDAIATIMYTVVTSMLNPFIY 

SLMNKEVQEAVRRLFSRGSHSSWCW 


544 


1894


A 


*T 1 JO 


3 


538 


LLYAQAGVQ*LNLSSLQrQPAGLKQSSIIPSLP 
SSWDYRYSTPHPANFFVEMEFHHVAQAGLEL 
LGSGDLPTSTSHSAGITGVNSHHAPPRLISSEGS 
LLGHLLCLPMVFPLLCVFVLISSSLAGEEAAG 
LRVQKXWPAVVLSHLPVCWFHCSG1WSEVIE 
LKVGREGHVLPWQAHWEF 


545 


1895 


A 


4160 


1 


412 


HPLGLGLVPSE1FSPQDKKAADGSILAPARGE 

DLEAGLKGSFMDGRLQASVSVFRIQRVGSAM 

QDTASAMPCLPYYPTSHGFMAGGKSRSQGW 

ELELSGEPAPGWQVLAGYTYTQARYLRDASE 

ANVGQPLRPVDPR 


546 


1896 


A 


4174 


1252 


1190 


FFQVFIFLFLIFFKTEFHSCCPGAVQWHDLDSL 
QPPPPRFKGFSCLSLPSSWDYRHAPAHPANFV 
FLVETGFLHV\GQ\ASLELPTSGDTPAS\ASQSA 
GITGVSHHA*PRASGRRCW 


547 


1897 


A 


4176 


3029 


1 


AGPDGLAAJPASCQGARGQTRVPGAFSWLAP 

GSHHASEGLAPGVPPAGGVSAQELTAPPQEG 

WGLGAPPAAPRPESDEKRAGSDAVRSFSRGA 

RDSLGQRRLGGTRGAGPAGKGAQRTMGPAS 

GFHSFPPRPHQEPSPRSSCWQHLLWHCPWPQ 

PSRLFRLTPAQLLQGPGVLAAPPGP*HVPGFL 

AQSPWPLPSGPRSP*DPLHQGALVPLPQGGSP 

HTAPHCLPSVLSPA1QQPLLPTAST/SSRSPPAS 

TMAPIPSALAVWEPAGSSPQLSSAPADSS\PLP 

ALPKVLPPWTQKPLLGCLCQSPLPLLSPPDQI/ 

RCPPACSPAAASSFSFESQPCPSAPSKASPAPA 

ALVIVGPHHPP*SQQPQSQSVHPHGPGGPQPPL 

AASSLFWMFCQPPPPHPQFLWHRPLPVTCKA 

LASVPLCFRPAPGSLRQTPLPPQFHIPRPGLSAP/ 

PPPASGTSDSSDSRSPSASAARVWPPAVSPPPP 

AARHRPHPPEYFLSPCPFSCGFPRLLGRPRRPQ 

ALQTPRAWDLPPGSSPAPLCSGPELP*APPPLP 

PFPRVA*LGSGHPPSAQVPGLW*RCV*GHPIP 

RPVGHS*SGPPHSPPL*APPQAWPLELPPSRQC 

LQPLHLRAAQPLDPCCSLSPPGPPLPVPALPS 

WPGRP*SPSPASSQPPYHAGLPGPQSSPLPPGL 

PQLPSLRSGSQQPLLFFQCPGPGAVWGKGSPQ 

PLSPHPPPP/ARTQTFPVASRSLSPGTAPYSVCL 

TPSRSASSLPEWLASSLPKIPQSSGS\PLGPTSP 

MP*CFHRPSPPLP/LSSPFPA\LRPQAPQFPLHLP 

P*PPAPSPGCPLPPLAQQHQPSPPSPHARSTLT 

PPLWPSLALLP*PLPPPPPVPSFSASLLCSLPAH

ATPA^PfllA^crr nvDOTi mxriom/DDenuT a H

PGTWQPW/PVSPAPLSCLSAWDPWELPSPQPQ 
VCSTAELPTSCIXSSPGP\PAFQPPRFGCL*GPP 
GPPGLPPLQSSLSFPPPPPPVPQPPAPPALQWG 
LHLPGGRTK 


548 


1898 


A 


4180 


2369 


844 


RIHREEDFQFILKGIARLLSNPLLQTYLPNSTK 

KIQFHQELLVLFWKLCDFNKVGQPRGALQGD 

GEQLPQ*PGGRDSVRLRGVGQSCPSLELSPLG 

PSPHP*KFLFFVLKSSDVLDILVPILFFLNDAR 

ADQSRVGLMHIGVFILLLLSGECNFGVRLNKP 

YSIRVPMDiPVFTGTHADLLmVFHBaiTSGHQ 

RLQPLFDCLLTIVVNVSPYLKSLSMVTANKLL 

HLLEAFSTTWRFSAAQNHHLVFFLLEVFNN1 

IQYQFDGNSNLVYAIIRKRS1FHQLANLPTDPP 

TIHKALQRRRRTPEPL SRTGS QGG APP WRAP A 

PLPLQSQ^SRPVWWLLQALTS*PRSPRCQR 

MAPCGPWNLSPSRAWRMAARLRGSPARHGG 

SSGDRP/HSSASGQWSFfPEWVLSWKSKLFLQ 

TIMRLLQVLVPQVEKICIDKGLTDESEILRFLQ 

HGTLVGLLPVPHPILIRKYQANSGTAMWFRT 

YMWGVIYLRNVDPPVWYDTDVKLFEIQRV 


549 


1899 


A 


4191 


858 


321 


LPWQRLGVLLSRGKMAVTGWLESLRTAQKT 

ALLQDGRRKVflYLFPDGKEMAEEYDEKTSE 

LL VRKWRVKS ALGAMGQ WQLE VGDPAPLG 

AGNLGPELIKESNANPIFMRKDTKMSFQWRIR 

NLPYPKDVYSVSVDQKERCnVRTTNKKYYK 

KFSIPDLDRHQLPLDDALLSFA\TPTAP 


550 


1900 


A 


4192 


1 


1980 


IRHTGSDTAGVCGWLLLSGPCGVGLDLDSRLL 

GASAMRRSEVLAEESIVCLQKALNI-ILREIWE 

LIGIPEDQRLQRTEWKKHIKELLDMMIAEEE 

SLKERLIKSISVCQKELNTLCSELHVEPFQEEG 

ETTILQLEKDLRTQVELMRKQKKERKQE\LKL 

LQEQDQELC\EILCMPHYDIDSASVPSLEELNQ 

FRQHVTTLRETKASRREEF/V S S KRQIILCME 

ELDHTPDTSFERDWCEDEDAFCLSLENIATAL 

QKLLRQ\LEMQKSQNEAVCEG\LRTQI\RELW 

DRLQIPEEEREAVATIMSGSKAKVRK\ALQ\LE 

VDRLEELEKCKTMKKVIEA1RVELVQYWDQC 

FYSQEQRQAFAPFCAEDYTESLLQLHDAEIVR 

LKNYYEYHKELFEGVQKWEETWRLFLEFER 

KASDPNRFTNRGGNLLKEEKQRAKLQKMLP 

KLEEELKARffiLWEQEHSKAFMVNGQKFME 

YVAEQWEMHRLEKERAKQERQLKNKKQTET 

EMLYGSAPRTPSKRRGLAPNTPG1CARKLNTT 

TMSNATANSSIRPIFGGTVYHSPVSRLPPSGSK 

PVAASTCSGKKTPRTGRHGANKENLELNGSI 

LSGGYPGSAPLQRNFSINSVASTYSEFADPSLS 

DSSTVGLQRELSKASKSDATSGILNSTNIQS 






551


1901


A


4194 


3 


1008 


AWHEGLVSSPAIGAYLSASYGDSLWLVATV 

VALLDICFILVAVPESLPEKMRPVSWGAQISW 

KQADPFASLKKVGKDSTVLLMCITVCLSYLPE 

AG\QYSSFF\LYUl\QVIGFG\TVKIAAFIAMVGI 

LSIVAQTAFLSn^MRSLGNKNTVLLGLGFQML 

QLAWYGFGSQAWMMWAAGTVAAMSSITFP 

AISALVSRNAESDQQGVAQGIITGIRGLCNGL 

GPALYGF1FYMFHVELTELGPKLNSNNVPLQ 

GAVIPGPPFLFGACIVLMSFLVALFIPEYSKAS 

GVQKHSNSSSGSLTNTPERGSDEDDEPLLQDS 

SIWELSSFEEPGNQCTEL


552


1902 


A 


4197 


2 


14302 


ARPPPAPGSRQQKQKAAPGAAAAAELRGAR 

EPAPARRRGTMAD GGEGEDEIQFLRTDDEW 

LQCTATIHKEQQKLCLAAEGFGNRLCFLESTS 

NSKNWPDLSICTFVLEQSLSVRALQEMIANT 

VEKSEGQVDVEKWKFMMKTAQGGGHRTIX 

YGHAILLRHSYSGMYLCCLSTSRSSTDKLAFD 

VGLQEDTTGEACWWTIHPASKQRSEGEKVR 

VGDDL1LVSVSSERYLHLSYGNGSLHVDAAF 

QQTLWSVAPISSGSEAAQGYLIGGDVLRLLH 

GHMDECLTVPSGEHGEEQRRTVHYEGGAVS 

VHARSLWRLETLRVAWSGSHIRWGQPFRLR 

HVTTGKYLSLMEDKNLLLMDKEKADVKSTA 

FTFRSSKEKLDVGVKKEVDGMGTSEIKYGDS 

VCYIQHVDTGLWLTYQSVDVKSVRMGSIQR 

KAIMHHEGHMDDGISLSRSQHEESRTARVTRS 

TVFLFNRFIRGLDALSKKAKASTVDLPIESVSL 

SLQDUGYFHPPDEHLEHEDKQNRLRALKNR 

QNT,FQEFX5M1NLVLECID31LHVYSSAAHFAD 

VAGREAGESWKSH^NSLYELLAALIRGNRKN 

CAQFSGSLDWLISRLERLEASSGILEVLHCVL 

VESPEAU41IKEGHIKSIISLLDKHGRNHKVLD 

VLCSLCVCHGVAVRSNQHLICDNLLPGRDLL 

LQTRLVNHVSSMRPNIFLGVSEGSAQYKKWY 

YELMVDHTEPFVTAEATHLRVGWASTEGYSP 

YPGGGEEWGGNGVGDDLFSYGFDGLHLWSG 

CIARTVSSPNQHLLRTDDVISCCLDLSAPS1SF 

RINGQPVQGMFENFNIDGLFFPVVSFSAGI1CV 

RJFLLGGRHGEFKFLPPPG Y APC YEA VLPKEKL 

KVEHSREYKQERTYTRDLLGPTVSLTQAAFT 

P1PVDTSQIVLPPHLERIREKLAENIHELWVMN 

KIELGWQYGPVRDDNKRQHPCLVEFSKLPEQ 

ERNYNLQMSLETLKTLLALGCHVGISDEHAE 

DKVKKMKLPKNYQLTSGYKPAPMDLSFIKLT 

PSQEAMVDKLAENAHNVWARDRIRQGWTY 

GIQQDVKNRRNPRLVPYTPLDDRTKKSNKDS 

LREAVRTLLGYGYNLEAPDQDHAARAEVCS 

GTGERFRIFRAEKTYAVKAGRWYFEFETVTA 

GDMRVGWSRPGCQPDQELGSDERAFAFDGF 

KAQRWHQGNEHYGRSWQAGDWGCMVDM 

NEHTMMFTLNGEILLDDSGSELAFKDFDVGD 

GFIPVCSLGVAQVGRMNFGKDVSTLKYFTIC 

GLQEGYEPFAVNTNRDITMWLSKRLPQFLQV 

PSNHEHIEVTRJDGTIDSSPCLKVTQKSFGSQN 

SNTDIMFYRLSMPIECAEVFSKTVAGGLPGAG 

LFGPKNDLEDYDADSDFEVLMKTAHGHLVP 

DRVDKI)KEATKPEFNNHKDYAQEKPSRLKQ 

RFLLRRTKPDYSTSHSARLTEDVLADDRDDY 

DFLMQTSTYYYSVRIFPGQEPANVV/VGWITS 

DFHQYDTGFDLDRVRTVTVTLGDEKGKVHE 

SIBCRSNCYMVCAGESMSPGQGRNNNGLEIGC 

WDAASGLLTFIANGKELSTYYQVEPSTKLFP 

AVFAQATSPNVFQFELGRIKNVMPLSAGLFK5 

EHKNPVPQCPPRLHVQFLSHVLWSRMPNQFL 

KVDVSRISERQGWLVQCLDPLQFMSLHIPEEN 

RSVDILELTEQEELLKFHYHTLRLYSAVCALG 

NHRVAHALCSHVDEPQLLYAIENKYMPGLLR 

AGYYDLLBDIHLSSYATARLMMNNEYIVPMT 

EEreSITLFPDENKKHGLPGIGLSTSLRPRMQF 

SSPSFVSISNECYQYSPEFPLDILKSKTIQMLTE 

AVKEGSLHARDPVGGTTEFLFVPT JKLFYTLLI

LEKELSVDDAKLQGAGEEEAKGGKRPKEGLL

QMKLPEPVKLQMCIXLQYLCDCQVRHR1EAI 

VAPSDDFVAKLQDNQRFRYNEVMQALNMSA 

ALTARKTKEFRSPPQEQINMLLNFKDDKSECP 

CPEEIRDQaDFHEDLMTHCGIELDEDGSLDG 

NSDLTIRGRLLSLVEKVTYLKKKQAEKPVES 

DSKKSSTLQQLISETMVRWAQESVIEDPELVR 

AMFVLLHRQYDGIGGLVRALPKTYTINGVSV 

EDTINLL ASLGQIRSLL S VRMGKEEEKLMIRG 

LGD1MNNKVFYQHPNLMRALGMHETVMEV 

MVNVLGGGESKEUFPKMVANCCRFLCYFCR 

ISRQNQKAMFDHLS YLLENS S VGLASPAMRG 

STPLDVAAASVMDNNELALALREPDLEKWR 

YLAGCGLQSCQMLVSKGYPDIGWNPVEGER 

YLDFLRFAVFCNGESVEENANWVRLLIRRPE 

CFGPALRGEGGNGLLAAMEEAIKIAEDPSRD 

GPSPNS GSSKTLDTEEEEDDTIHMGNA1MTF Y 

SALIDLLGRCAPEMHLIHAGKGEAiRlRSILRS 

LIPLGDLVGV1SIAFQMPTIAKDGNWEPDMS 

AGFCPDHKAAMVLFLDRVYGIEVQDFLLHLL 

EVGFLPDLRAAASLDTAALSATDMALALNRY 

LCTAVLPLLTRCAPLFAGTEHHASLIDSLLHT 

VYRLSKGCSLTKAQRDS1EVCLLSICGQLRPS 

MMQHLLRRLVFDVPLLNEHAICMPLKLLTNH 

YERCWKYYCLPGGWGNFGAASEEELHLSRK 

LFWG1FDALSQKKYEQELFKLALPCLSAVAG 

ALPPDYMESNYVSMMEKQSSMDSEGNFNPQ 

PVDTSNITIPEKLEYFINKYAEHSHDKWSMDK 

LANGWIYGEIYSDSSKVQPLMKPYKLLSEKE 

KEIYRWPIKESLKTMLARTMRTERTREGDSM 

ALYNRTRRISQTSQVSVDAAHGYSPRAIDMS 

NVTLSRDLHAMAEMMAENYHNIWAKKKKM 

ELESKGGGNHPLLVPYDTLTAKEKAKDREKA 

QDBLKFLQINGYAVSRGFKDLELDTPSIEKRFA 

YSFLQQLIRYVDEAHQYILEFDGGSRGKGEHF 

PYEQEIKFFAKWLPLIDQYFKNHRLYFLSAA 

SRPLCSGGHASNKEKEMVTSLFCKLGVLVRH 

RISLFGNDATSIVNCLHILGQTLDARTVMICTG 

LESVKSALRAFLDNAAEDLEKTMENLKQGQF 

THTRNQPKGVTQIINYTTVALLPMLSSLFEHI 

GQHQFGEDLILEDVQVSCYRILTSLYALGTSK 

SIYVERQRSALGECLAAFAGAFPVAFLETHLD 

KHNIYSIYNTKSSRERAALSLPTNVEDVCPNIP 

SLEKLMEEIVELAESGIRYTQMPHVMEV1LPM 

LCSYMSRWWEHGPENNPERAEMCCTALNSE 

HM>TIlLGNILKirVN^GIDEGAWMKilLAVF 

SQPIINKVKPQLLKTHFLPLMEKLKKKAATVV 

SEEDHLKAEARGDMSEAELLILDEFTTLARDL 

YAFYPLLIRFVDYNRAKWLKEPNPEAEELFR 

MVAEVFmVSKSHNFKREEQNFVVQNEINN 

MSFLJTDTKSKMSKAAVSDQERKKMKRKGD 

RYSMQTSLIVAALKRLLPIGLNICAPGDQELIA 

LAKNRFSLKDTEDEVRDIIRSNIHLQGKLEDP 

AIRWQMALYKDLPNRTDDTSDPEKTVERVL 

DIANVLFHLEQKSKRVGRRHYCLVEHPQRSK 

KAVWHKLLSKQRKRAVVACFRMAPLYNLPR 

HRAVNLFLQGYEKSWIETEEHYFEDKL1EDLA 

KPGAEPPEEDEGTKRVDPLHQLILLFSRTALT 

EKCKLEEDFLYMAYADIMAKSCHDEEDDDG 
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* 










EEEVKSFEEKEMEKQKLLYQQARLHDRGAA 

EMVLQTISASKGETGPMVAATLKLGIAILNGG 

NSTVQQKMLDYLKEKKDVGFFQSLAGLMQS 

CSVLDLNAFERQNKAEGLGMVTEEGSGEKV 

LQDDEFTCDLFRFLQLLCEGHNSDFQNYLRT 

QTGNNTTVNinSTVDYLLRVQESISDFYWYY 

SGKDV1DEQGQRNFSKAIQVAKQVFNTLTEYI 

QGPCTGNQQSLAHSRL^-TJAVVGFLHVFAHM 

QMKLSQDSSQIELLKELMDLQKDMVVMLLS 

MLEGNVVNGTIGKQMVI^ILVESSNNVEMIL 

KFFDMFLKLKDLTSSDTFKEYDPDGKGVISK 

RDFHKAMESHKHYTQSETEFLLSCAETOENE 

TLDYEEFVKRFHEPAKDIGFNVAVLLTNI.SEH 

MPNDTRLQTFLELAESVLNYFQPFLGRIEIMG 

SAKRIERVYFEISESSRTQWEKPQVKESKRQFI 

FDWNEGGEKEKMELFVNFCEDT1FEMQLAA 

Q1SESDLNERSANKEESEKERPEEQGPRMAFF 

SILTVRS ALFALRYNILTLMRML SLKSLKKQM 

KKVKXNmTOMVTAFFSSYWSIFMTLLHFV 

ASVFRGFFRUCSLLLGGSLVEGAKKIKVAELL 

ANMPDPTQDEVRGDGEEGERKPLEAALPSED 

LTDLKELTEESDLLSDIFGLDLKREGGQYKLIP 

HNPNAGLSDLMSNPVPMPEVQEKFQEQKAK 

EEEKEEKEETKSEPEKAEGEDGEKEEKAKED 

KGKQKJLRQLHTHRYGEPEVPESAFWKKIIAY 

QQKLLOTFAR^IFYNMRMIALFVAFAINFGLL 

FYKVSTSSWEGKELFTRSSSENAKVTSLDSS 

SHRIIAVHYVLEESSGYMEPTVRILPILHTV1SF 

FCIIGYYCLKVPLVIFKREKEVARKLEFDGLYI 

TEQPSEDDIKGQWDRLVINTQSFPNNYWDKF 

VKRKVMDKYGEFYGRDRISELLGMDKAALD 

FSDAREKXKPKKDSSLSAVLNSIDVKYQMW 

KLGVVFTDNSFLYLAWYMT 


553 


1903


A 


4199 


31 


767 


LPELNGRGAGLRRAEPSERGGGAERTQQVAA 
LPLSHGHSHGGGGCRCAAER/VGAARGSAAC 
AYGLYLRIDKGRLQCLNESREGSGRGVFKPW 
ERAD\DRSKFVESDADEELLFNIPFTG\HVKLK 
GIIIMGEDDDSHPSEMRLYKMPQMSFDDTER 
EPDQTFSLNRDLTGELEYATKISRFSNVYHLSI 
HISKNFGADTTKVFY1GUIGEWTELRRHEVTI 
CNYEASANPADHRVHQVTPQTHFIS 


554 


1904 


A 


4200 


1 


961 


GIPCTEMGNFDNANVTGEIEFAIHYCFKTHSL 

EICIKACKNLAYGEEKKKKCNPYVKTYLLPD 

RSSQGKRKTGVQRNTVDPTFQETLKYQVAPA 

QLVTRQLQVSVWHLGTLARRVFLGEVIIPLAT 

WDFEDSTTQSFRWHPLRAKADKYEDSVPQS 

NGELTVRAKLVLPSRTRKLQEAQEGTDQPSL 

HGQLCLVVLGAKNLPVRPDGTLNSFVKGCLT 

LPDQQKLRLKSPVLRKQACPQWKHSFYFSGV 

TPAQLRQSSLELTVWDQALFGMNDRLLGGT\ 

RLGSKGDTAVGGDACSQSKLQWQKVLSSPN 

LWTDMTLVLH 


555 


1905 


A 


4211 


331 


2419 



KENKJCARNIJRMNQSRSRSDGGSEETLPQDH 

NHHENERRWQQERLHREEAYYQFINELNDE 

DYRLMRDHNLLGTPGEITSEELQQRLDGVKE 

QLASQPDLRDGTNYRDSEVPRESSHEDSLLE 

WLNTFRRTGNATRSGQNGNQTWRAVSRTNP 

NNGEFRFSLEIHVNHENRGFEIHGEDYTDIPLS 

DSNRDHTANRQQRSTNSPVARRTRSQTSVNFN 

GSSSNTPRTRT . ASRGQNPAEGSFSTJ A3RLRNGT

GG^GIPILAKAJniT^

gqrfgaahv^^gars>atvr}^ 

rlrstsnsrsrspiqrqsgtvyhnsqresrpv 

qqttrrsvrrrgrtrvfleqdrererrgtay 

tpfsnsrlvsrttveegeessrsstavrrhptit 

ldlqvr\rirpgenrdrdsianrtrsrvglae 

ntvtiesnsggfrrtisrlersgirtyvstitvp 

lrrisenelvepssvalrsilrqimtgfgelssl 

meadseselqrngqhlpdmhselsnlgtdn 

nrsqhregssqdrqaqgdstemhgeneitqp 

htrnsdsrggrqlrnpnnlvetgtlpojuah 

ffllnesddddrirgltkeqidnlstrhyehn 

SIDSELGKICSVCISDYVTGNKLRQLPCMffiF 
HIHCIDRWLSENCTCPICRQPVLGSN1ANNG 


556 


1906 


A 


4212 


3 


462


LQRQRQHPAAAPAVPVRCFTFCFTDIVIMPKR 

KSPENTEGKDGSKVTKQEPTRRSARLSAKPA 

PPKPEPKPRKTSAKKEPGAKISRGAKGKKEEK 

QEAGKEGTAPSENGETKAEEIHISRSTVNVST 

SRGTPPSTLSVKGQIETVRVKGTEN 


557 


1907 


A 


4213 


774 


507 


ARRFSCLTLQTSWGHRH\GPPRP\ANFVFLVET 
GFLHIGQAGHKLPTSGDPPASASQSARrrGMS 
HRTWFLASFLIDSCKNFIVYK1MYTL 


558 


1908 


A 


4225 


3 


1253 


TYRHAEREHPET^SATKVSYDYRHKRPKLLD 

GDQDFSDGRTQKYCKEEDRKYSFQKGPLNRE 

LDCFNTGRGRETQDGQVKEPFKPSKKDSIAC 

TYSNKNDVDLRSSNDKWKEKKKKEGDCRKE 

SNSSSNQLDKSQKLPDVKPSPINLRKKSLTVK 

VDVKKTVDTFRVASSYS1ERQMSHDLVAVG 

RKSEN FHP VFEHLDSTQNTENKPTGEFAQEIIT 

I1HQVKAKYFPSPGITLHERFS\KMADIHKADV 

NEIPLNSDPEIHRKIDMSLAELQSKQAVIYESE 

QTLIKIIDPNDLRHDIERRRKERLQNEDEHIFHI 

ASAAERDDQNSSFSKNYTTQRKDIITHKPFBV 

EGNHRNTRVRPFKSNFRGGRCQPNYKSGLVQ 

KSLYIQAKYQRLRFTGPRGFrraKFRERLMRK 

KKVP 


559 


1909 


A 


4235 


1 


323 


KFSIPFFLRWSFTLV\PRLEGNDMISVHCNLGL 
LGLSHSPASASQVGGITGTQHHTGUFGFLIET 
EFHHVGQAGLELLTSGDPPALAFQSAGITGVS 
HHAWLQVLNS 


560 


1910 


A 


4246 


2 


1569 


TLSLLERVLMKDIVTPVPQEEVKTVIRKCLEQ 

AALVNYSRLSEYAKIEGKKREMYELPVFCLA 

SQVMDLT1QNQKDAENVGRLITPAKKLEDTIR 

LAELVIEVLQQNEEHHAEAFAWWSDLMVEH 

AETFLSLFAVDMDAALEVQPPDTWDSFPLFQ 

LLVNDFLRTGLLICGNGKAFHKHLQDLFAPLVV 

R/YMWDLDGSSPIAQSIHRGLLSRESWEPVNN 

GSGTSEDLFWKLDALQTFIRDLHWPEEEFGK 

HLEQRLKLMASDMIESCVKRTRIIAFEVKLQK 

TSSIQQffRVPQFNMAPCrTWM^^ 

KUCSMEMGQEFAKMWHQYHSKIDELIEETV 

KEMITLLVAKFVTILEGVLAKLSRYDEGTLFS 

SFLSFTVKAASKYVDVPKPGMDVADAYVTF 

VRHSQDVLRDKVNEEMYIERLFDQWYNSSM 

NVICTWLTDRMDLQLHrYQLKTLIRMVKKTY 

RDFRLQGVLDSTLNSKTYETIRNRLTVEEATA 

SVSEGGGLQGISMKDSDEEDEEDD 


561 


1911 


A 


4257 


1300 


654 


SELVQFLLIKDQKKIPIKRADILKHVIGDYKD1 
FPDLFKRAAERLQYVFGYKLVELEPKSNTYIL
INTLEPVEEDAEMRGDQGTPTTGLLMIVLGLI
FMKGNTDCETEAWDFLLAL\GVYPTKKHLIFG 
DPKKLriBDFVRQRYLEYRRIPHTDPVDyH>Q 
WGPRTNLETSKMKVLKFVAKVHNQDPKDW 
PAQYCEALADEENRARPQPSGPAPSS 


562 


1912 


A 


4260 


1 


1498 


MVTWLYRFLPTSNMAAKLRSLLPPDLRLQF 

WLHARLQKCFLSRGCGSYCAGAKASPLPGK 

MAMGLMCGRRELLRLLQSGRRVHSVAGPSQ 

WLGKPLTTRLLFPAAPCCCRPHYLFLAASGPR 

SLSTSAISFAEVQVQAPPWAATPSPTAVPEV 

ASGETADVVQTAAEQSFAELGLGSYTPVGLI 

QNtLEFMHVDLGLPWWGAIAACTVFARCLIF 

PLTV^GQRE AARIHNHLPEIQKFS SRIREAKLA 

GDHIEYYKASSEMALYQKKHGIKLYKPLILPV 

TQAPIF1SFFIALREMANLPVPSLQTGGLWWF 

QDLTVSDPIYILPLAVTATMWAVLELGAETG 

VQSSDLQWMRNVKMMPLn'LPlTMHFPTAV 

FMYWLSSNLFSLVQVSCLRIPAVRTVLKIPQR 

VVHDLDKLPPREGFLESFKKGWKNAEMTRQ 

LREREQRMRNQLELAARGPLRQTFTHNPLLQ 

PGKDNPPNIPSS\SSSSSKPKSKYPWHDTLG 


563 


1913 


A 


4265 


623 


116 


MGGLAPTQTLEPT\REYQNTQLSVSYLLPEQN 
THGTRRTLSSGPSNNLPLPLSSSATMPSMjQCK 
HRSPNGGLr^QSPVK/TPPIPMSFQPVPGGV\L 
PRGSGNPPHGTSILTAPPALLPHPPTHPTQQSF 
LIQENNOTNHTHSHTHTYTETLSFFLYICVNN 
DRMEWGKSVF 


564 


1914 


A 


4270 


3 


368 


ILKRK1-SSLNSEVSTIQNTRMLAFKATAQLFIL 
GCTWCLGLtQVGPAAQVMAYLFTIINSLQGF 
FIFLVYCLLS\QQVQKQYQKWFREIVKSKSES 
ETYTLSSKMGPDSKPSEGDVFPRTSE 


565 


1915 


A 


4288 


83 


406 


RNSRPLWCSPPASQPRQAPVSQSCCCPLPSSSS 
PPSALLAPTKPRALGTLRLYECSPELCTTMLP 
PAWLLMLCQAPRPQDPDPRLTQPEKSLQEAP 
GQTGASRTPRT 


566 


1916 


A 


4298 


1041


229 


LNSSQKLACUGVEGGHSLDSSLSVLRSFYVL 

GVRYLTLTFTCSTPWAESSTKFRHHMYTNVS 

GLTSFGEKWEELNRLGMMIDLSYASDTLIRR 

VLEVSQAPVIFSHSAARAVCDNLLNVPDDILQ 

LLKKNGGrVMVTLSMGVLQCNLLANVSTVA 

DHFDHIRAVIGSEFIGIGGNYDGTGRFPQGL\E 

DVSTYPVLIEELLSRSWSEEELQGVLRGNLLR 

VFRQVEKVREESRAQSPVBAEFPYGQLSTSCH 

FHLGASEWTPRLLIWR 


567 


1917 


A 


4299 


1 


1106 


GATPLGSVGGRTGKMDAATLTYDTLRFAEFE 

DFPETSEPVWDLGRKYSIFTEKDEILSDVASRL 

WFTYRKNFPAIGGTGPTSDTGWGCMLRCGQ 

MIFAQALVCRHLGRDWRWTQRKRQPDSYFS 

VLNAFIDRKDSYYSIHQIAQMGVGEGKSIGQ 

WYGPNTVAQVLKKI^VFDTWSSLAVHIAMD 

NTWMEEIRRLCRTSVPCAGATAFPADSDRH 

OTGFPAGAEVTNRPSPWRPLVLLIPLRLGLTD 

INEAYVETLKHCFMMPQSLGVIGGKPNSAHY 

FIGYVGEELIYLDPHTTQPAVEPTDGCFIPDES 

FHCQHPPCRMSIAELDPSIA WRGGI ELSTQAF 

G AECCLGMTRKTFGFLRFFF SMLG 


568 


1918 


A 


4300 


2012 


1843 


SRKFLTITPIVLYFLTSFYTKYDQIHFVLNTVS 
LMSVLIPKLPQLHGVRIFGINKY 


569 


1919 


A 


4302 


186 


531


WTFCLFL/WWVPESARWLLTQGHVKEAHRY
iu^ob WGgpspsnwvp VMvrveiv jut
CVWWGVGCVKCLPPRAHHIWQEKPLGPHRT 
VTESKLEAEGKTKEKAREKERKKKS 


570 


1920 


A 


4308 


3 


869 


RSGQGKVYGLIGRRRFQQMDVLEGLNLLITIS 

GKRNKLRVYYLSWLRNKILHNDPEVEKKQG 

WTTVGDMEGCGHYRVVKYERIKFLVIALKSS 

VBVYAWAPKPYHKFMAFKSFADLPHRPLLV 

DLTVEEGQRLKVEYGSSAGFHAVDVDSGNSY 

DIYIPVHIQSQITPHAOFLPNTDGMEMLLCYE 

DEGVYVNTYGRUKDVVLQWGEMPTSVAYIC 

SNQIMGWGEKAIEIRSVETGHLDGVFMHKRA 

QRLKFLCERNDKVFFASVRSGGSSQVYFMTL 

NRNCIMNW 


571 


1921 


A 


4309 


9 


524 


ASREMDVTKVCGEMRYQLNKTNMEKDEAE 
KEIiREFRAKTNRDLEIKDQEIEKLRIELiDESK 
QHLEQEQQKAALAREECLRLTELLGESEHQL 
HLTRQEKDSIQQSFSKEAKAQALQAQQREQE 
LTQKIQQMEAQHDKTENEQYLLLTSQNTFLT 
KLKEECCTLAKKLEQISQ 


572 


1922 


A 


4318 


1 


1119 


GATPLGSVGGRTGKMDAATLTYDTLRFAEFE 

DFPETSEPVWELGRKYSIFTEKDEILSDVASRL 

WFTYRXNFPAIGGTGPTSDTGWGCMLRCGQ 

MDPAQALVCRHLGRDWRWTQRKRQPDSYFS 

VLNAFIDRKDSYYSIHQIAQMGVGEGKSIGQ 

WYGPNTVAQVLKKLAVFDTWSSLAVHIAMD 

NTVVMEE1RRLCRTSVPCAGATAFPADSDRH 

CNGFPAGAEVTNRPSPWRPLVLLIPLRLGL\T 

DINEAYVXETLMCHCFHGWQFPG/WHREGK 

PNSAHYFIGYVGEELIYLDPHTTQPAVEP1DG 

CFIPDESFHCQHPPCRMSL^ELDPSIAVVRGGH 

LSTQAFGAECCLGMTRKTFGFLRFFFSMLG 


573 


1923 


A 


4333 


363 


3066 


GGVPVGLASKPFQILYGHTNEVLSVGISTELD 
MAVSGSRDGTVnHTIQKGQYMRTLRPPCESS 
LFLTIPNIAlSWEGHlWirSSTEEKTTLK\ERM 
HYICFSINGKYLGSQILKEQVSD1CIIGEHIVTG 
S1QGFLS1RDLHSLNLSINPLAMRLP1HCVCVT 
KEYSHILVGLEDGKLIWGVGKPAEVKPSISN 
FISHAVGDYFGSPSFQLffiKSPLGINKLKAKFD 
FSKGSK 


574 


1924 


A 


4346 


359 


1234 


MDTLEEVTWANGSTALPPPLAPN1SVPHRCLL 

LLYEDIGTSRVRYWDLLLLIPNVLFLIFLLWK 

LPSARAKIRITSSPIF1TFYILVFVVALVGIARA 

VVSMTVSTSNAATVADKILWEITRFFLLAIEL 

SVIILGlj\fGHLESKSSIKRVIAITTVLSLAYSV 

TQGTLEILYPDAHLSAEDFNIYGHGGRQFWL 

VSSCFFFLVYSLWILPKTPLKERISLPSRRSFY 

VYAGILALLNLLQGLGSVLLCFDIIEGLCCVD 

ATTFLYFSFFAPLIYVAFLRGFFGSEPKILF


575


1925


A


4JOU 


2038 


1512 


GCWWRHPWLASQRDCLDCRIQLAEKFVKAV 

SKPSRPDMNPIRVKEVYRLEEMEKIFVRLEM 

KIIKGSSGTPKLSYTGRDDRHFVPMGLYIVRT 

VNEPWTMGFSKSFKKKFFYNKKTKDSTFDLP 

ADSIAPFHICYYGRLFWEWGDGIRVHDSQKP 

QDQDKLSKEDVLSFIQMHRA 


576 


1926 


A 


4365 


69 


500 


QYEGRQGREVKRTAWRISPVWRPARCRJRRST 
PQP/PE/PGAQQQERHRQGEAPMQALDPRAEP 
GPQAQSHAACQPEPEPPRVLLDPTAARGG V Q 
GRP/GLSRHPGLAPHPQTHTPWPQSGRLPCAS 
EPLPLGGIRPTPGLEPKGRDLM


577 


1927 


A 


4366 


785 


502 


sappkjo^gVLFlsprlkssgaiWVhstptlw 
assnsrastpkvagitgarphari1fvfliemg 
fhnvgqagl/dtltlvicppqppkllglqm 


578 


1928 


A 


4367 


1 


221 


FFFFLKKSRCVTQAGVQGNPISLHPPPPGFKRF 

SRLSIXSSWDYRHP/HAANFCBFSRDGWSPYW 

SGWSRTPDLR 


579 


1929 


A 


4383 


1 


224 


FETESHSVTQAGMQWHNLGSLQPMP/PGLKR 
FSCLRLQSSWDHRHAPPHLAHFCIFSRDGVSP 
CWPGWSSTPDLK 


580 


1930 


A 


4397 


410 


94 


SRLKPYSTNVTAKKLPATNIPNLDCFTAKLYQ 
^VFKKGI^IHILHELFQNKEEGAFPNS/FYEASFT 
LRPKSDRDIAKEESYSTISLLSTDTKILMSKYK 
QLKSSDL 


581 


1931 


A 


4414 


670 


3 


VLVHRQCGGILRLRRKEAVSVLDSADIEVTDS 

RLPHATIVDHRPQHRWLETCNAPPQLIQGKA 

RSAPKPSQASGHFSVELVRGYAGFGLTLGGG 

RDVAGDTPLAVRGLLKDGP\AQRCGRLEVGD 

LVLI IINGESTQGLTVI IAQAVERIRAGGPQLHL 

V1RRPLETHPGKPRGVGEPRKGWPSWPDRSP 

DPGGPEVTGSRSSSTSLVQHPPSRTTLKKTRG 

SPE 


582 


1932 


A 


4424 


194 


449 


VLYIRKKKRLEKLRHQLMPMYNFDPTEEQDE 
LEQELLEHGRDAASVQAATSVQAMQGKTTL 
PS\QGPLQRPSRLVFTVDVANAIHV 


583 


1933 


A 


4435 


1 


166 


APGPPVPPPGSPPEQMPGPCPASMPP/DPPPGS 
PPEQMPGPCPVSAPP/GPPPGSPPEQMPGPCPV 
SAPPALLQDTSV 


584 


1934 


A 


4439 


1 


628 


SATPQQPSAPQHQGTLNQPPVPGMDESMSYQ 

APPQQLPSAQPPQPSNPPHGAHTLNSGPQPGT 

APATQHSQAGPATGQAYGPHTYTEPAKPKK 

GQQLWNRMKPAPGT\EVSSSTSRSDPLLLPPR 

ALAPTQRASTVVLAPSPT/SEKVQNHSGSSAR 

GNLSGKPDDWP/LGHERVCGALLHRL+VGGG 

QGPHGKAAQGGAAGAAAGRLGLYH 


585 


1935 


A 


4463 


10 


144 


HKJPVTNSRDTQEVPLEKAK.QVLKIIATFKHTT 
S1FDDFAHYEKRQ 


586 


1936 


A 


4464 


1309 


103 


LNAESYVSFTTKJLDIPTAAKYEYGVPLQTSDS 

FLRFPSSLTSSLCTDNNPAAFLVNQAVKCTRK 

INLEQCEEIEALSMAFYSSPEILRVPDSRKKVPI 

TVQSIVIQSLNKTLTRRED1DVLQPTLVNAGH 

FSLCVNV\TLEVKYSLTYTDAGEVTKADLSFV 

LGTVSS WVPLQQKFEIHFLQENTQPVPLSGN 

PGYVVGLPLAAGFQPHKGSGIIQTTNRYGQLT 

ILHSTTEQDCLALEGVRTPVLFGYTMQSGCK 

LRLTGALPCQLVAQKVKSLLWGQGFPDYVA 

PFGNSQGP/ADMLDWVPEHFITQSFNRKDSCQ 

LPGALV1EVKWTKYGSLLNPQAKIVNVTANLI 

S S S FPEAN S GNERTILISTAVTF VD VS AFAE AG 

FRAPP AINARLPFNFFFFFV 


587 


1937 


A 


4471 


614 


387 


LLGRASAC/LQLQSSW/D/HRPMLPYLANFVF 

CKDR/SFTWLPRLVLNSWLQVILLPWPPTGCD 

NKHEPPCPAIXKRHSGSI 


588 


1938 


A 


4480 


1720 


1458 


HDLGSLQPPPPGFKRFSCLSLPSSWDYRLMPP 
CPANFCDII/DFLVETGFHHVGQASHELLTSGD 
PPTSASQSAGITGMSYHTWFGES 


589 


1939 


A 


4487 


922 


332 


APVTT SPRVGQP W/RTALALRSLYRARPSLRC 
PPVELPWAPRRGHRLSPADDELYQRTR1SLLQ 
REAAQAMYIDSYNSRGFMINGNRVLGPCALL 
PHSWQWNVGSHQDITEDSFSLFWLLEPRIEI
\r\r\rn,Tfzri6ztD.m AbAitt ..i>VAtin/M>cT-«*n?i7n

DTPNACATFNFLCHEGRVTGAALIPPPGGTSL 
TSLGQAAQ 


590 


1940 


A 


4492 


1 


472 


FFFFETESRSVAQAGVQWRDLGSLQAPPPGFT 

PFSCLSLPSSWDYRRPPLRPANFFVFLVBTGFP 

RFSRDGLDLLT/S/GDPPTSASQSAGITGVSHR 

ARPKRIGEPRRKCGNAWWPSTSLGDHRVTS 

VPIIQGGLPGPIRVAPSSAGQREASQGPPGR 


591 


1941 


A 


4495 


1444 


1116 


IAARFTL^TWNQLKRP\TMIDSIKKTR\YIYT 
MEYYADTERNEIMSF\AGTWVELEAIILSKLM 
LKDNWVEDT1PQGAVPCTATAEGMKRLLFAL 
EPWDSSCFPHPSSGV 


592 


1942 


A 


4496 


2 


919 


RTRPLFSGRPTRPVCTMSDERRLPGSAVGWL 

VCGGLSLLANAWGILSVGAKQKKWKPLEFL 

LCTLAATHMLNVAVPIATYSWQLRRQRPDF 

EWNEGLCKVFVSTFYTLTLATCFSVTSLSYHR 

MWNWCWPVNYRLSNAKKQAGHTVMGIWM 

GSFILSALPAVGWHDTSERFYTHGCRFIVAEI 

GLGFGVCFLLLVGGSVAMGVICTAIALFQTL 

AVQVGRQADI-IRAFTVPTIWEDAQGKRRSSI 

DGSEPAKTSLQTTGLVTT1VFIYDCLMGFPVL 

GPFSLADTHLSDLPYTWGDRDSGGACVM 


593 


1943 


A 


4506 


2 


193 


FFFEAESCSVPQAGVQRPDLGWLHAPPPVGSC 
HFPASASQVAGTTHARHHTQLIF\AFLVENGL 
C 


594 


1944 


A 


4507 


1327 


647 


KMAGGVRPLRGLRALCRVLLFLSQFCELSGG 

ESTEIPPYVMKCPSNGLCSRLPADCIDCTTNFS 

CTY GKPVTFDCAVKPSVTCVDQDFKSQKNFII 

NMTCRFCWQLPETDYECTNSTSCMTVSCPRQ 

RYPANCTVR\DHVHCLGNRTFPKMLYCNWT 

GGYKWVYGLWLLRHHPRWGLGADRRYLGP 

VAGTASGKLFSFGGLG1WTLIDVLLIGVGYVG 

PADGSLYI 


595 


1945 


A 


4512 


533 


264 


KFFKMESVSVARLECSGAISAPCNLHLLGSNN 
SPASASRV/AGNIGARHHTQQIFVLLVQMRVH 
YVGQDGLDLL/NLMIHPPRSPKVLGLQA 


596 


1946 


A 


4513 


3 


1674 


HASDFILYPNFLVNELILKQKQRFEEKRFKLD 

HSVSSTNGHRWQIFQDWLGTDQDNLDLANV 

hO^LELLVQKKKQLEAESHAAQLQILMEFLK 

VARRNKREQLEQIQKELSVLEEDIKRVEEMS 

GLYSPVSEDSTVPQFEAPSPSHSSIIDSTEYSQP 

PGFSGSSQTKKQPWYNSTLASRRKRLTAHFE 

DLEQCYFSTRMSRISDDSRTASQLDEFQEOLS 

KF\TRYNSVRPL\A'1XSYASDLYNGSQYKSLV 

FEFDRDCDYFAIAGVTKKIKVYEYDTVIQDA 

VDIHYPENEMTCNSKISCISWSSYHKNLLASS 

DYEGTVILWDGFTGQRSKVYQEHEKRCWSV 

DFNLMDPKLLASGSDDAKVKLWSTNLDNSV 

ASIEAKANVCCVKFSPSSRYHLAFGCADHCV 

HYYDLRNTKQPIMVFKGHRKAVSYAKFVSG 

EEIVSASTDSQLKLWNVGKP\YCLRSFKGHIN 

EKNFV\GLASNGDY1A0GSENNSLYLYYKGLS 

K1LLTFKFDTVKSVLDKDRKEDDTNEFVSAV 

CWRALPDGESNVLIAANS\QGTI\KVLELV 


597 


1947 


A 


4518 


536 


824 


RSLALSPGLECSGM1SAHCNLHLLGSSDPP'1 , S 
ASQVAEITSVRHHTWUFCWLGQMGFHHVGE 
QAGLELLTSWDPAILPSQSAGIIGMSPHAWPP 


598 


1948 


A 


4524 


1 


384 


FDTEFVNIGGDFDAAAGVFR\CRLPGAYFFSF 
TLGKLPRKTLSVKLMKNRDEVQAMIYDDGSS
RRREMQSQSVMLALRRGDAVWLLSHDHDG
YGAYSNHGKYITFSGFLVYPDLAPAAPPGLG 
ASELL 


599 


1949 


A 


4526 


366 


776 


MGQPAPYAEGPIQGGDAGELCKCDFLVFTSP 
NPEAVCEAGTPAMFQTAWRQMESCSI/AQAG 
VQWRDPGSI^PPLGFKOTSCLSLPSSWDYK 
HAPPHPANFCIFSRDQVSPCWPGWSRSLDLVT 
PPPWLPKVLGLQA 


600 


1950 


A 


4529 


776 


334 


FFFETESCYVAQAGVQWCDLCSLQAPPPGVSS 
DPPASASRVAGTTGARHHTQLIFVFLVETGFH 
\MLARDGLKLLTSSDPPASASQSSWDYRREPP 
RLANFFVFLVETGSRYVAQAGVQWLFTGAIP 
LLISTGVLTCSVSDLGRFTPP 


601 


1951 


A 


4533 


1460 


403 


HEVQESIHFLESEFSRGiSDNYTLALlTYALSS 
VGSPKAKEALNMLTWRAEQEGGMQFWVSSE 
SKLSDSWQPRSLDffiVAAYALLSHFLQFQTSE 
GIPIMRWLSRQRNSLGGFASTQDTTVALKALS 
EFAALMNTERTNIQVTVTGPSSPSPVKPLIDT 
HNRLLLQTAELADGTANGSV/SISANGFGFAI 
CQLNWYNVKASGSSRRRRSIQNQEAFDLDV 
AVKENKDDLNHVDLNVCTSFSGPGRSGMAL
MEVNLLSGFMVPSEAISLSETVKKVEYDHGK
LNLYLDSVNETQFCVNIPAVRNFKVSNTQDA
SVSIVDYYEPRRQAVRSYNSEVKLSSCDLCSD
VQRLPSL


602 


1952 


A 


4540 


1963 


295 


MRAPGRPALRPLPLPPLLLLLLSSPWGRAVPC 

VSGGLPKPANITFLSINMKNVLQWTPPEGLQG 

VKVTYTVQYF1YGQKKWLNKSECRNINRTYC 

DLSAEfijDYEHQYYAKVKAIWGTKCSKWAE 

SGRFYPFLETQIGPPEVALTTDEKSISWLTAP 

EKWKRNPEDLPVSMQQIYSNLKYNVSVLNT 

KSNRTWSQCVTNHTLVLTWXLEPNTLYCVHV 

ESFVPGPPRRAQPSEKQCARTLKDQSSEFKAK 

IIFWY\a,PISrrVFLFSVMGYSIYRYIHVG\KEK. 

HPVANLILrVTG\NEFDKRFFVPA\EKIV\INFI\TL 

NISVDDSKJSHQDMSLLGKSSDVSSLNDPQPSG 

NLRPPQEEEEVKHLGYASHLMEIFCDSEENT\ 

EGTSFTQQESLSRTBPPDKTVIEYEYDVRTTDI 

CAGPEEQELSLQEEVSTQGTLLESQAALAVL 

GPQTLQYSYTPQLQDLDPLAQEHTDSEEGPEE 

EPSTTLVDWDPQTGRLCIPSLSSFDQDSEGCE 

PSEGDGLGEEGLLSRLYEEPAPDRPPGENETY 

LMQFMEEWGLYVQMEN 


603 


1953


A 


4543 


3 


600 


YSAVEFVEQASGISDWWNPALRKRMLSDSGL 

GMIAPYYEDSDLKDLSHSRVLQSPVSSEDHAI 

LQAVIAGDLMKUESYKNGGSLLIQGPDHCSL 

LHYAAETGNGEIVKYILDHGPSELLDMADSE 

TGETALHKAACQRNRAVCQLLVDAGASLRKN 

TDSKGKTPQERAQQA\GDPDLAA/YTIESRQN 

YKVIGHEDLETAV 


604 


1954 


A 


4548 


3 


938 


QDNKVQNGSLHQKDTVHDNDFEPYLTGQAN 

QSNSYPSMSDPYLSSYYPPSIGFPYSLNEAPW 

STAGDPPIPYLTTYGQLSNGDHHFMHDAVFG 

QPGGLGNNIYQHRFNFFPENPAFSAWGTSGS 

QGQQTQSSAYGSSYTYPPSSLGGTWDGQPG 

FHSDTLSKAPGMNSLEQGMVGLKIGDVSSSA 

VKTVGSWSSVALTGVLSGNGGTNVNMPVS 

KPTSWAAIASKPAKPQPKMKTKSGPVMGGG 

LPPPPIKHNMDIGTWDNKGPVPKAPVPQQAP

SPQAAPQPQQVAQPy?A.QPpALAQPQYQSPQ
QPPQ 


605 


1955 


A 


4553 


2 


2304 


E.LQEKRNCLLMQLEEATRLTSYLQSQLKSLC 

ASTLTVSSGSSRGSLASSRGSLASSRGSLSSVS 

FTDIYGLPQYEKPDAEGSQIXRFDLIPFDSLGR 

DAPFSEPPGPSGFHKQRRSLDTPQSLASLSSRS 

SLSSLSPPSSPLDTPFLPASRDSPLAQLADSCE 

GPGLGALDRLKAHASAMGDEDLPGMAALQP 

HGVPGDOEGPHERGPPPASAPVGGTVTLRED 

SAKRLERRARRISACLSDYSLASDSGVFEPLT 

KRNEDAEEPAYGDTASNGDPQIHVGLLRDSG 

SECLLVHVLQLKNPAGLAVKEDCKVHIRVYL 

PPT,nSGlTOTYCSKALEEQVPLVFNEVFRIPV 

HSS AL1UCSLQL YVCS VTPQLQEELLG1AQIN 

LADYDSLSEMQLRWHSVQVFTSMNHQGRGR 

LGVQERAPPGTLHTPSPSPA/STDAYTVLLAR 

TTAQLQ A VEREL AEERAKLE YTEEE VLEMER 

KEEQAEAISERSWQADSVDSGCSNCTQTSPPY 

PEPCCMGIDSILGHPFAAQAGPYSPEKFQPSPL 

KVDKETNTEDLFLEEAASLVKERPSRRARGSP 

FVRSGTIVRSQTFSPGARSQYVCRLYRSDSDS 

STLPRKSPFVRNTLERRTLRYKQSCRSSLAEL 

MARTSLDLELDLQASRTRQRQLNEELCALRE 

LRQRLEDAQLRGQTDLPPWVLRDERLRGLLR 

EAERQTRQTKLDYRHEQAAEKMLKKASKEI 

YQLRGQSHKEPIQVQTFREKIAFFTRPRINIPPL 

PADDV 


606 


1956 


A 


4555 


3429 


776 


PGSGPGPAPFLAPVAAPVGGISFHLQIGLSREP 

VLLLQDSSGDYSLAHVREMACSIVDQKFPEC 

GFYGMYDKILLFRHDPTSENILQLVKAASDIQ 

EGDLIEWLSASATFEDFQIRPHALFVHSYRA 

PAFCDHCGEMLWGLVXRQGLKCEGCGLNYH 

KRCAFKIPNNCSGVRRRRLSNVSLTGVSTIRT 

SSAELSTSAPDEPLLQKSPSESFIGREKRSNSQ 

SYIGRPDiLDKILMSKVKVPHTFVIHSYTRPTV 

CQYCKKLLKGLFRQGLQCKDCRFNCHKRCA 

PKVPNNCLGEVTINGDLLSPGAESDWMEEG 

SDDNDSERNSGLMDDMEEAMVQDAEMAMA 

ECQNDSGEMQDPDPDHEDANRTISPSTSNNIP 

LMRVVQSVKHTKRKSSTVMKEGWMVHYTS 

KDTLRKRHYWRLDSKC3TLFQNDTGSRYYKE 

IPLSEILSLEPVKTSALIPNGANPHCFEnTANV 

VYYVGENVVNPSSPSPNNSVLTSGVGADVAR 

MWE1AIQHALMPVIPKGSSVGTGTNLHRD1SV 

SISVSNCQIQENVDISTVYQIFPDEVLGSGQFGI 

VYGGKHRKTGRDVAIKIIDKLRFPTKQESQLR 

NE V AILQNLHHPG WNLECMFETPERVF WM 

EKLHGDMLEMILSSEKGRLPEHITKFLITQILV 

ALIUJIJHFKNIVHCDL^^ 

KLCDFGF ARIIGEfCSFRRS V VGTPA YL APE VL 

RNKGYNRSLDMWSVGVHYVSLSGTFPFNED 

EDIHDQIQNAAFMYPPNPWKEISHEAIDLINN 

LLQVKMRKRYSVDKTLSHPWLQDYQTWLDL 

RELECKIGERY1THESDDLRWEKYAGEQGLQ 

YPTHLINPSASHSDTPETEETEMKALGERVSIL 


607 


1957 


A 


4563 


1 


4499 


SRPWWLRASER^SAPSAWAKRSRGPGRkCflT 

ALVLFCAWGTLAWAQKPGAGCPSRCLCFRT 

TVRCMHLLLEAVPAVAPQTSILDLRFNRIREI 

QPGAFRRLRNLNTLLLNNNQIKRIPSGAFEDL 

ENLKYT^TYKNFJQSIDRQAFKGLASLEQLYL

HFNQIETLDPDSFQHLt*KLERLFLHNNR]THL

VPGTFNHLESMKMJU^DSNTLHCDCEILWLA 

DLLKTYAESGNAQAAA1CEYPRRJQGRSVATI 

TPEELNCBRPRITSEPQDADVTSGNTVYFTCR 

AEGNPKPEHWLRNNNELSMKTDSRLNLLDD 

GTLMIQNTQETDQGIYQCMAKNVAGEVKTQ 

EVTLRYFGSPARPTFVIQPQNTEVLVGESVTL 

ECSATGHPPPR1SWTRGDRTPLPVDPRVNITPS 

GGLYIQNWQGDSGEYACSATNNIDSVHATA 

FIIVQALPQFTVTPQDRWIEGQTVDPQCEAK 

GNPPPVIAWTKGGSQLSVDRRHLVLSSGTLRI 

SGVALHDQGQYECQAVNIIGSQKWAHLTVQ 

PRVTPVFASIPSDTTVEVGANVQLPCSSQGKF 

EPA1TWNKDGVQVTESGKFHISPEGFLTINDV 

GPADAGRYECVARNTIGSASVSMVLSVNVPD 

VSRNGDPFVATSIVEAIATVDRAINSTRTHLF 

DSRPRSPNDLLALFRYPRDPYTVEQARAGEIF 

ERTLQLIQEHVQHGLMVDLNGTSYHYNDLVS 

PQYLNLIAN^GCTAHRJlVNNCSDMCFIiQKY 

RTHDGTCNNLQHPMWGASLTAFERLLKSVY 

ENGFNTPRGINPHRLYNGHALPMPRLVSTTL1 

GTETVXPDEQFTHMLMQWGQFLDHDLDSTV 

VALSQARFSDGQHCSNVCSNDPPCFSVMIPPN 

DSRARSGARCMFFVRSSPVCGSGMTSLLMNS 

VYPREQINQLTSYIDASNVYGSTEHEARSIRD 

LASHRGLLRQGIVQRSGKPLLPFATGPPTECM 

RDENESPIPCFLAGDHRANEQLGLTSMHTLW 

FREHNRIATELLKLNPHWDGDTIYYETRKIVG 

AEIQIJITYQHWLPKILGGVGMRTLGEYHGYD 

PGINAGIFNAFATNAAFRFGHTLVNPLLLPGLD 

ENFQPIAQDHLPLHKAFF SPFRTVNEGGIDPLL 

RGLFGVAGKMRVPSQLLNTELTERLFSMAHT 

VALDLAAINIQRGRDHGIPPYHDYRVYCNLS 

AAHTFEDLKNEIKNPEIRKCLKRLYGSTLNID 

LFPALWEDLVPGSRLGPTLMCLLSTQFKRLR 

DGDRLWYENPGVFSPAQLTQIKQTSLARILCD 

NADNTTRVQSDVFRVAEFPHGYGSCDEIPRVD 

LRVWQDCCEDCRTRGQFNAFSYHFRGRRSLE 

FSYQEDKPTKKTRPRKIPSVGRQGEHLSNSTS 

AVFSTRSDASGVTNDFQRVCSWEMQKTITDLR 

TQIKKLESRVLSTTECVDAGGESHANNTXWK 

KDACTICECKDGQVTCFVEACPPATCAVPVN1 

PGACCPVCLQKRAEEKP 


608 


1958 


A 


4566 


354 


1135 


FSFLC/GVSGRLGLDSEEDYYTPQKVDVPKAL 
IIVAVQCGCDGTFLLTQSGKVLACGLNEFNKL 
GLNQCMSGIINHEAYHEVPYTTSFTLAKQLSF 
YKIRTIAPGKTHTAAIDERGRLLTFGCNKCGQ 
LGVGNYKKRLGINLLGGPLGGKQVIRVSCGD 
EFTIAATDDNHIFAWGNG GNGRLAMTPTERP 
HGSDICTSWPRPIFGSLHHVPDLSCKGWHTILI 
VEKVLNSKTIRSNSSGLSIGTVFQSSSPGGGGE 
GGPDAW 


609 


1959 


A 


4567 


1 


412 


FFFFETESRSVAQAGVQWRDLGSLQAPPPGFT 

PFSCLSLPSSWDYRRPPLRPANFFVFLVETGF 

HRFSRDGLDLLT/S/GDPPASASQSAGITGVSH 

RARPRINIJINVTYSFAVT^CLNYISLAMSSTL 

KLSFHVLSGS 


610 


1960 


A 


4570 


697 


467 


ECRGVISAHNCCTLCLPSSSDSASAI^RVARTT 
GTCD YAQLEFAFL VEMGFHHVG QD GLHLL/N 
LVIRPPRPPKVLGLQA


D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,


611


1961 


A 


4571




1396 


ArMJUTTV/ldttPDA A .6 AWmrr nmn n<<r<Soi>r»T. 

WNPNVPESPRIPAPRLPKilMSGAPTAGAALM 
LCAATAVLLSAQGGPVQSKSPRFASWDEMN 
VLAHGlXQLOQQ\CANlNOAHPQSAERAOA\R 
LSACGSACQGTEGSTDLPLAPESRVDPEVLHS 
LQTQLKAQNSRIQQLFHKVAQQQRHLEKQHL 
WQHLQSQFGLLDHKJ^DHEVAKPARRKRLP 
EMAQPVDPAHNVSRLHRLPRDCQELFQVGER 
QSGLFEIQPQGSPPFLVNCKMTSDGGWTVIQR 
RHDGSVDFNRPWEAYKAGFGDPHGEFWLGL 
EKVHSITGDRNSRLAVQLRDWDGNAELLQFS 
VHLGGEDTAYSLQLTAPVAGQLGATTVPPSG 
LSWFSTWDQDHDLRRDKNCAKSLSGGWWF
GTCSHSNLNGQYFRSIPQQRQKLKKGIFWKT 
WRGRYYPLQATTMLIQPMAAEAAS 


612 


1962 


A 


4575 


162 


3 


FFFETESRS VAQAG VQ WRDLS SLQPPPPGNSR 
GSPAS ASFVAGITGTRHHRTRG 


613 


1963 


A 


4584 


687 


321 


PLAQRRPFLWVTVKTNGHIWGSSTYPHFWGS 
SNS/PASASQVAGIPNARHQARIIFWLVEPRF 
HHVGRAGLGFL/NLAICLPQHPKVLGLQACN 
LMmiPAHKYISMIQFNVHFMCMSVHIYI 


614 


1964 


A 


4589 


727 


299 


PGSAQSAQRGRGRRRARAGSATQITMYSFMG 

GGLFCAWVGTILLWM1ATDHWMQYRLSGS 

FAHQGLWRYCLGNKCYLQTDSIAYWNATRA 

FMILSALCAISGIIMGIMAF/GWVAVLMTFFA 

GIFYMCAYRVHECRRLSTPR 


615 


1965 


A 


4590 


2 


414 


T1LPEKIQAWAQKQCPQSGEEAVALWHLEK 

ETGRLRQQVSSPVHREKHSPLGAAWEVADFQ 

PEQVETQPRAVSREEPGSLHSGHQEQLNRKR 

ERRPLPKNARPSPWVPALADEWNTLHQEVTT 

TRLPAGSQEPVKD 


616 


1966 


A 


4592 


773 


488 


DFALVAQAGVQWHNLGSPQPLPPGFKRFSCL 
SLPSSWEYRCVPP/RLANFVFLVEMGFLHVGQ 
AGLELPTSGDPPALASQSAGTrGVTTVPSGPG 


617 


1967 


B 


4595 


14 


478 


XRHGLREPLLERRCAAASSFQHSSSLGRELPY 
DPVDTEGFGEGGDMQERFLFPEYILDPEPQPT 
REKQLQELQQQQEEEERQRQQRREERRQQNL 
RARSREHPWGHPDPALPPSGVNCSGCGAEL 
HOQDAR*


618 


1968 


A 


4596 


2945 


1188 


ARSRNSARGVYGMCVDltFLCFLEDLERNDG 

S AERP YFMC STLKKPLARRCFP AIHA YKG VL 

MVGNETTYEDGHGSRKNITDLVEGAKKANG 

VLEARQLAMRIFEDYTVSWYWinGLVIAMA 

MSLLSilLLHLLAGIMGWVMliMEIVSELGYRIF 

HCYMEYSRJLRGEAGSDVSLVDLGFQTDFRV 

YLHLRQTWLAFMHLSILEVinLLLIFLRKRILI 

AIALIKEASRAVGYVMCSLLYPLVTFFLLCLCI 

A YWAST AVFL STSNEAVYK1FDDSPCPFTAKT 

CNPETFPSSNESRQCPNARCQFAFYGGESGYH 

RALLGLQIFNAJMFFWLANFVLALGQVTLAG 

AFASYYWALRKPDDLPAFPLFSAFGRALRYH 

TGSLAFGALHAIV QIIRVILEYLDQRLKAAEN 

KFAKCLMTCLKCCFWCLEKFIKFLNRNAYIM 

IAfYGTNFCTSARNAFFLLMRNIIRVAVLDKV 

TDFLFLLGKLLIVGSVGILAFFFFTHRIRIVQDT 

APPLNYYWVPHTVIVGSYLIAHGKFSVYGMC 

VDTLFLCFLEDLERNDGSAERPYFMSSTLKKL 

LNKTNKKAAES 


619 


1969 


A 


4601 


2 


357 


RTSVEPYILGEF/RKLSNNTKWKTEYKATEY 
GLA YGHFS YEF SNHRD WVDLQGWVTGNGK

GLIYLTDPQIHSVDQKVFTTNFGKRGIFYFFN

NQHVECNE1CHRLSLTRPSMEKPCKS


620 


1970 


A 




1 


2415


MERLWGLFQRAQQLSPRSSQTVYQRVEGPR 

KGHLEEEEEDGEEGAETLAHFCPMELRGPEP 

LGSRPRQPNLIPWAAAGRRAAPYLVLTALLIF 

TGAFLLG YV AFRGS CQ ACGD S VL WSED VN 

YEPDLDFHQGRLYWSDLQAMFLQFLGEGRL 

EDTERQTSLRERVAGSAGMAALTQDIRAALS 

RQKLDHVWTDTHYVGLQFPDPAHPNTLHWV 

DEAGKVGEQLPLEDPDVYCPYSAIGNVTGEL 

VYAHY GRPEDLQDLRARG VDPVGRLLLVRV 

GVISFAQKVTNAQDFGAQGVLIYPEPADFSQ 

DPPKPSLS SQQ AVYGHVHLGTGE>PYTPGFPSF 

NQTQFPPVASSGLPSIPAQPISADIASRLLRKL 

KGPVAPQEWQGSLLGSPYHLGPGPRLRLWN 

NHRTSTPDNTNIFGCIEGR5EPDHYWIGAQRDA 

WGPGAAKSAVGTADLLELVRTFSSMVSNGFR 

HLKAVVYVSLDNAVLGDDKFHAKTSPLLTSL 

ffiSVLKQVDSPNHSGQTLYEQVVFTN\PSWD\ 

AE VIRPLPMNDS S A Y\SFT AF VG VP AVEF SFME\ 

DDQ\AYPFLHTKEDTYENLHKVLQGRLPAVA 

QAVAQLAGQLLIRLSHDRLLPLDFGRY GUV V 

LRHIGNLNEFSGDLKARGLTLQWVYSARGDY 

IRAAEKLRQEIYSSEERDERLTRMYNVRIMRV 

EFYFLSQYVSPADSPFRHIFMGRGDHTLGALL 

DHLRIXRSNSSGTPGATSSTGFQ\ESRFRRQL\ 

ALLMWDACKGAANALSGDVWNIDNNF 


621 


1971 


A 


4610 


793 


334 


ISRVDDFVGSGIANVIIAVAIFSIPAFARLVRG\ 

NTLVLKQQTFIESARSIGASDMTVLLRH1LPGT 

GSSIWFFTMRIGTSI1SAASLSFLGLGAQPPTP 

EWGAMLNEARADMVIAPHVAVFPALAIFLTV 

LAFNLLGDGLRDALDPKIKG 


622 


1972 


A 


4614 


2 


820 


LVYVMIAIFCIASAMSLYNCLAALIHKIPYGQ 

CHACRGKNMEVRLIFLSGLCIAVAWWAVF 

RNEDRWAWILQDILGIAFCLNLIKTLKLPNFK 

SCV1LLGLLLLYDVFFVFLTPFJTKNGESIMVEL 

AAGPFGNNEKND GNLVEATGQPS APHEKLP V 

VIRVPKLIYFSVMSVCLMPVSILGFGDDVPGL 

LIAYCRRFDVQTGSSYIYYVSVXTVAYAIGMIL 

TFVVLGVLMKKGQPALLYLVPCTLITA/CQFV 

AWETVREMKKFWERVTS 


623 


1973 


A 


4619 


17 


691 


TLVSWEFVRRADLTREDLAPSSVDSGQAGF 

GGCCESGLPNTMPSAFSVSSFPVSIPAVLTQT 

DWTEPWLMGLATFHALCVLLTCLSSRSYRLQ 

IGHFLCLVILVYCAEYINEAAAMNWRLFSKY 

QYFDSRGMHSIVFSAPLLVNAMIIVVMWVW 

KTLNVMTDLKNAQERRKEKKRRRKED*GAA 

AAWSLEPSRPPSAAPSAAVCVAWASFQLTHG 

LKNRCFI 


624 


1974 


A 


4622 


164 


668 


VSCYTALQS1MNQPESANDPEPLCAVCGQAH 

SLEENHFYSYPEEVDDDUCfflCLQALLDPLD 

TPCGHTYCTLCLTNFLVEKDFCPMDRKPLYL 

QIICKKSSILVNKLLNKLLVTCPFREHCTQVL 

QRCDLEHHFQTSQAWGTHL*SQLLGRLRQED 

CLSPGVHHCSEV 


625 


1975 


A 


4625 


474 


473 


CFLSPSPLLPPLLLSSSSSPSFPLPPPPTLLPSTLP 
PPLLIPSS*LSP 


626 


1976 


A 


4629 


249 


3 


LFFET*SHSmLECSGTlSAHCSLQLQGSSNSP 
ASASQVAGIAGTHH


627 


1977 


A 


4635 


1 


301 


FFFFETKPFFAFQAGGQGPSRGSLNPLPTGLK 
QFS GLTLSRS GNNGPRPPPRVNFGDLRGNGVP 
PGGAG*PRPPDLRGPPGLAPPQGGNNGGDPP 
ARAYL 


628 


1978 


A 


4648 


1357


782 


KLFSSQRLrGPIUQAINPSaLLSFFPS*LLAMR 
TVGNNAFILVFLVYRIVLLLF*HV* PAYFQPSK 
NKTAKINCN*RPFLFLVCYLL*AELHIGIFIANF 
YDCIPNKLNEHLWPBCLLQSLIFHVDFCGFLHK 
VFYICFTEFLLFLYFL*LFIIKVSCSII*CSTICVF 
SYKSFAVHFFVDNTRFFSFGF 


629 


1979 


A 


4660 


18 


999 


HHELHTLELLQNPKEVLTRSEIQDVNYSLEAV 

KVKTVCQIPLMKFJvflJCRFQVAWLA^ 

PKLVFSQEGRYVKNTASASSWPVFSSAWNYF 

AGWRNPQKTAFVERFQHLSCVLGKN VFTS G 

KHYWEVESRDSLEVAVGVCREDVMGITDRS 

KMSPDVGI WAIYW S AAGYWPLIGFPGTPTQQ 

EPALHRVGVYLDRGTGNVSFYSAVDGVHLH 

TFSCSSVSRLRPFFWLSPLASLVIPPVTDRX*G 

FSSPDQNSEPWQLRDTHPWALFCPSCLYPG 

WSIFWVSLTVPFGICPLCASQEAVPWEVGLA 

NGDGTGNFPRRFWEIFL 


630 


1980 


A 


4669 


2 


358 


FFFFFETESHSVAQAGMQWRNLGSLPAPPPGF 
TPFFCLSLLNGWDYRRPPPHLANFFVLLVETG 
FHD VGQDGLDLLTS* STPSASQSAEITGVSHC 
TRLKKIRFAKGHVEFFFESHVE 


631 


1981 


A 


4674 


953 


614 


TPIRGTDDEHEECTVQEYSAGKNTCLRPGAV
AHTCNPCTLGGRGRWIT*GSGVQDQPGPTWQ 
NPVFLERRPRALHSSPGLTTQRILWAQGLWV 
GAGSTGCSRGPRGEGVFREG 


632 


1982 


A 


4678 


34 


314 


RSTHASGMISPSFGFMGHLLRLEFEILPSTPNP 
*LPSYQGEAAGSSLISHLQTFSPDLKGVYCTFP
ASGLAPVPTHWTVSELSRSPVATATFC 


633


1983 


A 


4696 


1 


1365 


RTLGMEGERRASQAPSSGLPAGGANGESPGG 

GAPFPGSSGSSALLQAEVLDLDEDEDDLEVFS 

KDASLMDMNSFSPMMPTSPLSMINQIKFEDEP 

DLKDLFTl'VDKPESHVrilE'lHTYRnTKTSRG 

EFDSSEFEVRRRYQDFLWLKGKLEEAHPTLII 

PPLPEKF1VKGMVERFNDDFIETRRKALHKFL 

NRIADHPTLTFNEDFKIFLTAQAWELSSHKKQ 

GPGLLSRMGQTVRAVASSMRGVKNRPEEFM 

EMNNFEELFSQKINLIDKJSQR1YKEEREYFDE 

MKEYGPIH1LWSASEEDLVDTLKDVASCIDRC 

CKATEKRMSGLSEALLPWHEYVLYSEMLM 

GVMKRRDQIQAELDSKVEVLTYKKADTDLL 

PEEIGKLEDKVECANNALKADWERWKQNM 

V^iNUUvbAr 1 UMAchNlxiY YcQLLATWESFLT 
SQTNLHLEEASEDKP 


634 


1984 


A 


4708 


421 


158 


SYWVGEDYTYKFFEVILIDPFMKAlRRNPD'rQ 
WISKAVYKHREMCGLTSTGRKSHGLEKDRM 
FPHAIGGSCRAA*RRRKTLQFPCYH 


635 


1985 


A 


4709 


42 


341 


YIKQPDAKERRRTVHWKKETESEASEITIPPST 
PGVPQAPGHWEDYGRGDNFYLPH*DPGGIVL 
WNIFNRMPIARKN1TDGEHHEYLIEVPRLFHT 
SED 


636 


1986 


A 


4721 


2 


351 


EKPDhl<hPt:GTSFIHEPRRPN*GDLVHCLGGlS 
RSTTVTVA* LMQKLNLSMNDAYYTVTMKMS S 
ISFNFNSMDQPLDFQRTLGLRSPCYNRVPAQK
MYFTTPSNHNAYQVDSVQST 


637 


1987 


A 


4726


&4 


253 


NTGLTCS31Q1UCCGETQLYRREENRLILIXQDH 
LKSESFQVLTLSPRLEFSGLISAHCNLRLPGSS 
DSSASSSRAAGITGVHHHAWLIFFFLVETGFL 
HAG*AGLELLTSGDPPASASRSAGITGVSHHA 
RPRETRFL 


638 


1988 


A 


4734 


24 


592 


GGMDSRVSGTTSNGETKPVYPVMEKKEEDG 

TLERGHWNNKMEFVLSVAGEIIGLGNVWRFP 

YLCYKNGGGAFFIPYLVFLFTCGIPVFLLETAL 

GQYTSQGGVTAWRKICP1FEGIGYASQMIVIL 

LNVYYIIVLAWALFYLFSSFTIDLPWGGCYHE 

WNTEHCMEFQKTNGSLNGTSENATSPVIEFW 


639 


1989 


A 


4743 


1040 


699 


QGLTLLPRMECSATITAHCSLELPGSIDLPTSA 
S*VARTTGTHHHPWLILVLLL*TWGSYYVAQ 
AGLELLGSSNLPAAMVSQSAQIIGHDHCAWA 
TSNHVLYTQEGLRRGKEG 


640 


1990 


A


4771


527 


2 


GRiDCPHPATVtAQPIFIDACSVLGAYQGAQN 

WIRRRPCLPSGCLKMNREIGPLQHSLCCPGWS 

QTPGLKAILLRQPPK*LGLQMESHSCPPAWSA 

MARSRLTATSASQVQAILLPQPPGTTDSCSPS 

PDHEQQPLSWVLPPPQKDMNPREQQVALGP 

QAAALPWAVWRNDCFPR 


641


1991 


A 


4780 


16 


473 


RPSSQCOGIFT^WI^GLAPELSSELSSPPLPAR 
LQLAASPYFSPSWAECPQPVPAGTHATWCLA 
RVWARMTPPGPAGIPSHPLPPPPPERSVPIPSP 
FPARDSGSRQGHSTDRYKHTDAPRDAHRRVP 

QRDTDTGVPITGSGTHTHAHTPPEK:


642 


1992 


A 


4798 


1 


487 


GYSFRCDIVDYSRSPT^ALRMARTCWLVYFSK
FIELLDTIFFVLRKKKSQVTFLHWHHTIMPW
TWWFGVKFAAGGLGTFHALLNTAVHVVMY
SYYGLSALGPAYQKYLWWKKYLTSLQLVQF

VTVAIHISQFFFMEDCKYQFPWACiTMSYSFM

FLLLFLH


643 


1993 


A 


4799 


2 


391 


LMAFIEMHISGSLVYLKIKTKIYSYFSMLNFLL
QEIPLSEILRISSPRDFTNISQGSNPHCFEIITDT
MVYFVGENNGDSSHNPVLAATGVGLDVAQS
WEKAIRQALMPVTPQASVCTSPGQGKDHSK
Q*ASVCTSPGQGKDHSKQ


644 


1994 


A 


4800 


488 


101 


AYPLFAVHPVHTECVAGWGfeAYLLCALFFL 
LSFLGYCKAFRESNKEGAHSSTFWVLLSIFLG 
AVAMLCKEQGITVLVRAATWLGPAFSVCPFP 
SYKDIWGWPCLCGVLHAYIPLLV 


645 


1995 


A 


4805 


458 


126 


LLWTTVLCQTPARPQSTM1HLGHILFLLLLPV 
AAAQTTPGERSSLPAFYPGTSGSCSGCGSLSL 
PLLAGLVAADAVASLLIVGAVFLCARPRRSP 
AQEDGKVYINMPGRG 


646 


1996 


A 


4817 


47 


1033 


LQGDTWHLSFLSHFSRLHGGVPGRGLLEGNL 

LQPQAPGHDMTSIPFPGDRLLQVDGVILCGLT 

HKQAVQCLKGPGQVAKLVLERRVPRSTQQC 

PSANDSMGDERTAVSLVTALPGRPSSCVSVT 

DGPKF+SSN*KRIANGLGFSFVQMEKESCSHL 

KSDLVRKRLFPGHPAEEKGAIAAGDIILGRE 

WEGPRKASSSRCRGSWAMQLSVQAGPSFAS 

YYPAAVEVLHLLRGAPQEVTLLLCRPPPUAL 

PELEQEWQTPELSADKEFTRATCTDSCTSPIL 

GSRGQLGGTVPPQMQGKAWGLRPESSQKAIR 

EGTMGAKTERDLGPVP 


647 


1997 


A 


4854 


1044 


335 


PRVRGDWPLEKKKSNSNIHPIFSWCGSTDSKD 
TVMPTVni.TriQVl .vrvhxrzv n».n.#oi7/> i.v™

GPPWESKNSTAVWRORDSRKERLELVKLSRK 

HPELlDAAFTNFFFFiaiDmYGPIVKHSFFD 

FFKHKYQINIDGTVAAYRLPYLLVGDSVVLK 

QDSIYYEHFYNELQPWKHYIPVKSNLSDLLEK 

LKWAKDHDEEAKKIAKAGQEFARNNLMGD 

DITCYYFQTFPRNMPIYK 


648 


1998 


A 


4867 


2030 


837 


AtiMLPAVGsAbEEEDPAEEDCPELVPMErTQ 

SEEEEKSGLGAKIPVTirrGYLGAGKTTLLNYI 

LTEQHSKRVAVILNEFGEGSALEKSLAVSQG 

GELYEEWLELRNGCLCCSVKDNGLRA1ENLM 

QKKGKFDYlLLJErrrGLADPGAVASMFWVDA 

ELGSDIYLDG1ITIVDSKYGLKHLAEEKPDOU 

NEATRQVA1ADA1LINKTDLWEEDVKKLRT 

TIRSINGLGQILETQRSRVDLSNVLDLHAFDSL 

SGISLQKKLQHVPGTQPHLDQSrVTTTFDVPG 

NAKEEHLNMFIQ>^LTOKNVRNKDNHCMEV 

IRLKGLVSIKDKSQQV1VQGVI-1ELYDLEEITV 

SWKDDTERTNRLVLLGRNLDKDILKQLFIAT 

VTETEKQ WTTHFKEDQ VCT 


649 


1999 


A 


4873 


226 


189 


DGVSLLLPKLGVQWAQYWAHWQPPLPGFKR 
FSCLSLRSS WD*KCAPPHPAFVFLVEMGFHRV 
GQAGLELRTSGDPPASASQSAG[TGVSHLA*P 
TSMPLLPFQRLCVYI 


650 


2000 


A 


4874 


2 


437 


FFFLRRSFAFVAQAOVQWCDLGSPQFLPPGF 
K*FSCLSLPSSWDYRHAPPPCPS*FLYF**RQG 
FTMLARLVLNS+PHDLPTSPSQSAEIKGVSHR 
CP ASFYLFLK Y YLEAKFCA* GEC APS AGVGA 
GYKRGHKSCLL1NCWQI 


651 


2001 


A 


4898 


1701 


771 


DAWGPETRIAR1LNPDSF1EPRPGRLPELEATR 

PHMEPKASCPAAAPLMERKFHVLVGVTGSV 

AALKLPLLVSKLLDIPGL£VAVVTTERAKHFY 

SPQDIPVTLYSDADEWEMWKSRSDPVLHIDL 

RRWADLLLVAPLDANTLGKVASGICDNLLTC 

VMRAWDRSKPIXFCPAMNTAMWEHPITAQQ 

VDQLKAFGYVEIPCVAKKLVCGDEGLGAMA 

EVGTIVDKVKEVLFQHSGFQQS*PGISVMGVP 

LYSEWVQAKSVKMDVGKIGGYPHLLNGGPA 

LSLPRGQACSRLNWTEGPGLSFFQPGEAAA 


652 


2002 


A 


4927 


1 


611 


FRGRQTSRPARGFSPWRPPGTMQEPSSGECPA 

SP*LPCASNRLAFGGLIFPCAPLVPYPAPFSPLL 

PAFSCAPRPRAHTHSRTHPSAPLVPKPSSRAR 

GQSPIPSRASSPSCSWAQVPGVALARCAGVC 

KPGDSWRVAACISGRCCSRGRRRGSGPRNPE 

QSFRGAWGPSFWGSWKSQRELSAGGAQAWP 

LLGSAGSGLRGEA 


653 


2003 


A 


4965 


2 


283 


FFFFI* DGVSLCHPG WNAVARS WLTATS ASR 
VQAVSCFRLPS S WDYRHATMPG*FF* YF**R 
WGFTH All VI N*?*PnVTOPPWPPiirvr tt c\k 


654 


2004 


A 


4968 


3 


437 


RPGIPGRRFRRSWFCQLP*EPEPGLESLATPGD 

IPAVGLGALGVIPPVRVPQRPPTQRSQGRGW 

DPERDPGCRVQVSRGPRFGEQKTPGLQGCLP 

PPaTHLAAASCVWWCGRWKRDSAECQCD 

HSCSAVSQQEDRCRSSSCS 


655 


2005 


A 


4983 


201 


397 


M>O^TTCIQPSMISSMAlPnYILLCIVGVFGN 
TLSQWIFLTKIGKKTSTHIYLSHLVTANLLVC 


656


2006 


A 


4988 


332 


159 


LVHKDIvTVTOFFEEEAQASNKHVTRCLTSLVI 
REVHIKTMR*HFU>JGRLEKNKNNKD 


657 


2007 


B 


5008 


129 


465
SEQ ID NO: of nucleotide sequence


SEQ ID NO: of peptide sequence


Method


SEQ ID NO: in USSN 09/496,914


Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence


Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence


Amino acid sequence (A=Alanine, C=Cysteine,
D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion)


VTLRVTCjfeSHltjftVLLKlVECjINRKQDWSDH 

AIWWEQKRQWLLQTHWTLDKYGILADARLF 

FGPQHRPVILRLPNRRALRLX* 


658 


2008 


A 


5017 


1 


292 


FFFFKETESHSVTQAGVQWHDLGSLQPPPPGF 
KRFSCl^LSSWDYRCAPPHPANFVFLVETGF 
HHVAQAGUOXTL*SANLGLSTSLPIPLFILLS 


659 


2009 


A 


5018 


17 


338 


RGHGGKSLTGGTPGNWGDGLLVSEDWSHLIF 
T*NSLVSPVLGKWSPCLQGPGLSAVHTWPWL 
MAACWAVHVKTHMRPGLAVLPRLVLNSWS 
*AHLLWPPKALGLQA


660 


2010


A 


5028 


2 


310 


SRVDDFVGERRGGCDECLCGHRGLRAVPLG 
HPGHLCLQPPGGPA*FLDYCRGCCPHPVPGST 
AGSCPRQKKTTPGPTVLCVCSFWIYQROEPH 
HRTGARWNH 


661


2011


A


JUJU 


/ 


431 


RQSCSSTQAKVQWFHYGPLQSQPPGLKQSSQ 
LSLPNSRDHRHVPPRLAIFSFAETGSPYFAQAS 
LELLGSSHPPTSASQSARITG VSHRAWPLK* F 
NLNQYQTLTMN 






A 






103 


ELNNGPFOMPLCNGGNLAVTGSWADRSPLH 

EAASQGRLLALRTLLSQGYNVNAVTLDHVTP 

LHEACLGDHVACARTLLEAGANVNATTIDGV 

TPLFNACSQGSPSCAELLLEYGAQAQLESCLP 

SPTHEGASKGHHECLDILISWGIDVDQEBPHSG 

TPLYVACMAQQFHa^LIYAGAGVRKGICY 

WDTPLPGAGHQSTQKLE*LFAMVEIWQ 


663 


2013 


A 


5066 


951 


580 


VRNS* SFAHCASVYKHH YMDGQTPCLFVSSK 
ADLPEGVAVSGPSPAEFCRKHRLPAPVPFSCA 
GPAEPSTTIFTQLATMAAFPHLVHAELHPSSF 
WLRGLLGVVGAAVAAVLSFSLYRVLVKSQ 


664 


2014 


A 


5071 


550 




LSFIEVLSMEQVNKTVYREFWLGFSSLARLQ 

QLU 7 VIFLLLYLFTLGTNAIIISTIVLDRALHTP 

MYFFLAILSCSEICYTFVIWKMLVDLLSQKK 

TISFLGCAIQMFSFLFFGSSHSFLLAAMGYDR 

YMAICNPLRYSVLMGHGVCMGLMAAAWAC 

GFTVSLVTTSLVFHLPFHSSNQHE 


665 


2015 


A 


5074 


496 


692 


QQYHNTGSAGHHAHCQVGHSPHVHYPSGCXi 
PL*IQRGLPSFNSLEGHSLKDSGHEESVQLDSE 
HDVQRSLYCDTAVNDVLKTSVTSMGSQMPD 
HDQNEGFHCREECRILGHSDRCWMPRNPMPI 
RSKSPEHVRNIIALSIEATAADVEAYDDCGPT 
KRTFATFGKDVSDHPAEERPTLKGKRTVDVT 
ICSPKVNSV1REAGNGCEAISPVTSPLHLKSSL 
PTKPSVSYEIVDPGITARRC 


666 


2016


A


3680 


408 


248 


!MLLSTSS*VYFQSS¥RDStlFFLFDFQKTGPPL 
VGPKAQLSGLQLQPCLYKRR 


667 


2017 


A 


5081 


129 


247 


DLTNSHFFLFDFQKTGPPLGGPKAQFSSLQLQ 
PCVY+RR 


668 


2018 


A 


5086 


852 


233 


"NlKSNDRWVOlKTAYKYFF*KNGDNYNWVF 
RAUTTFADIENLKYLLFTRDASQPFYLGtrrV 
IFGDLE YVT VEGGI VL SRELMKRLNRLLDN SE 
TCADQSV1WBCLSEDKQLAICLKYAGVHAENA 
EDYEGRDVFNTKPIAQLDEEALSNNPQQWEG 
CCSDMAJTFNGLTP QKMEVMMYGL YRLRAF 
GHYFNDTLVFLPP VG SEND 


669 


2019 


A 


5101 


1 


329 


PGRPTRPPLLTLLAHVSPEPAGPSCDSLAQPG 
ASGV*VQHDSHPPLLCGSQCLSEPVPGSHGPP 
RGCQHEAAPCPRGPGSDGLHHASAACASLPP 
SPILPVLLPELGPL 


670 


2020 


A 


5102 


3 


547 


DAWGt^RCAVGAAPRLlHLHLCCTPADPSRKP 
VGGTDQQYVSl^SGIWSRKENG 
LQEGDK1LSVNGQDLKNLLHQDAVDLFRNA 
GYAVSLRVQHRLQVQNGPIGHRGEGDPSGIPI 
FMVLVPVFALTMVAAWAFMRYRQQL 


671 


2021 


A 


5105 


672 


400 


RDGREELCLQQEPTLPSRJCSSAPLLYFLFICPF 
VLLLLLUSLLCLYWKARKLSTLRSNTRKEKA 
L WVDLKEAGG VTTNRMED* EEDECN 


672 


2022 


A 


5148 


72 


314 


1IYFSYNIFLK1TELLNDVERLKQALNGLSQLT 
YTS GNPTKRQSQLIDTLQHQ VKSLEQQLA VS 
NQAHGALQE YVL APC S 


673 


2023 


A 


5152 


210 


335 


REILCSRJGRLNIV*MSLFFNLTCRLNAIPIKIPA 
NHFVEVT 


674 


2024 


A 


5153 


3 


2953 


LTEDQPFDILQKSLQEAN 1TEQTLAEEAYLD A 

SIGSSQQFAQAQLHPSSSASFTQASNVSNYSG 

QTLQPIGVTHVPVGASFASNTVGVQHGFMQH 

VGISVPSQHLSNSSQISGSGQIQLIGSFGNHPS 

MMTINNLDGSQIILKGSGQQAPSNVSGGLLV 

HRQTPNGNSLFGNSSSSPVAQPVTVFFNSTNF 

QTSLPVHNUIQRGLAPNSNKVPINIQPKPIQM 

GQQNTYNVNNLGIQQHHVQQGISFASASSPQ 

GSWGPHMSVNIVNQQNTRKPVTSQAVSSTG 

GSIVIHSPMGQPHAPQSQFLIPTSLSVSSNSVH 

HVQTINGQLLQTQPSQLISGQVASEHVMLNR 

NSSNMLRTNQPYTGPMLNNQNTAVHLVSGQ 

TFMSGSPVIANHASPQLVGGQMPLQQASPT 

VLHLSPGQSSVSQGRPGFATMPSVTSMSGPSR 

FPAVSSASTAHPSLGSAVQSGSSGSNFTGDQL 

l'QHNR.lVVPVSVSHRLPVSSSKSTSTFSNTPGT 

GTQQQFFCQAQKKCLNQTSPISAPKTTDGLR 

QAQIPGLLSTTLPGQDSGSKVISASLGTAQPQ 

QEKWGSSPGHPAVQVESHSGGQKRPAAKQ 

LTKGAFILQQLQRDQAHTVTPDKSHFRSLSD 

AVQRLL S YHVCQQSMPTEEDLRKVDNEFETY 

ATQLLKRTQAMLNKYRCLLLEDAMRINPPAE 

MVMIDRMFNQEERASLSRDKRLALVDPEGFQ 

ADFCCSFKLDKAAHETQFGRSDQHGSKASSS 

LQPPAKAQGRDRAKTGXOEPMNHDQFHLVP 

NHIWSAEGNISKKTECLGRALKFDKVGLVQ 

YQSTSEEKASRREPLKASQCSPGPEGHRKTSS 

RSDHGTBSKLSSILADSHLEMTCNNSFQDKSL 

RNSPKNEVLHTDIMKGSGEPQPDLQLTKSLET 

TFKNJDLELKKAGRQPQSDPTVSGSVELDFPNF 

SPMASQENCLEKFIPDHSEGWETDSILEAAV 

NSILEC 


675 


2025 


A 


5154 


599 


1880 


LKKMEPFSCDTFVALPPATVDNRIIFGKNSDR 

LYDEVQEVVYFPAVVHDNLGERLKCTYIEID 

QVPETYAWLSRPAWLWGAEMGANEHGVCI 

GNEAVWGREEVCDEEALLGMDLVRLGLERA 

DTAEKALNVIVDLLEKYGQGGNCTEGRMVF 

SYHNSFLIADRNEAWILETAGKYWAAEKVQE 

GVRMSNQLSITTKIAREHPDMKNYAKRKGW 

WDGKKEFDFAAAYSYLDTAKMMTSSGRYCE 

GYKLLNKHKGNITFETMMEILRDKPSGINME 

GEFLTTASMVFILPQDSSLPCIHFFTGTPDPER 

SVrTCPFIFVPfflSQLLDTSSPrFELEDLVKKKS 

HFKPDRRHPLYQKHQQALEVVNNNEEKAKI 

MLDNMRKLEKELFREMESILQNKHLDVEKrV 

NLFPQCTKDEIQrYQSNLSVKVSS 


676 


2026 


A 


5155 


2 


306 


FFFLRRSLALSPRPDCGLQWR3SfL6SLQAPPPG 
FTPFSCLSLPS S WD YRRPPPRPANFL YF* * RRG 
FTLLARMVSIS*PHDPPASASQSAGITGVSHRA 
RPT 


677 


2027 


A 


5167 


97 


740 


FFHSVDLLALEQSKTFYKPDWFDIVESEVKCC 
KEAVCV1DMSSFTEFEITSTGDQALEVLQYLF 
SNDLDVPVGHIVHTGMLNEGGGYENDCSIAR 
LNKRSFFMISPTDQQVHCWAWLKKIIMPKDS 
NLIXEDVTWKYTALNLIGPRAVDVLSELSYA 
PMTPDHFPSLFCKEMSVGYANGIRVMSMTHT 
GEPGFMLYIPIEYRWGFTMLSTLVSNS 


678 


2028 


A 


51^3 


1919 


2018 


PALCRLRDDMTVCVADFGLSKKIYSGDYYRQ 
GRIAKMPVKWIAIESLADRVYTSKSDVWAFG 
VTMWEIATRGMTPYPGVQNHEMYDYLLHG 
HRLKQPEDCLDELCKI**SPQSP 


679 


2029 


A 


5190 


39 


499 


RESQVKHFKMRKIDLCLSSEGSEVILATSSDE 
KHPPENIIIXjNPETFWTTTGMFPQEFIICFHKH 
VRIERLVIQSYFVQTLKIEKSTSKEPVDFEQWI 
EKDLVHTEGQLQNEEIVAHDGSATYLRFHVS 
AFDHFASVHSVSAEGTWSNLSS 


680 


2030 


A 


5204 


541 


92 


EII^VLKLACGDISLNALALMVATAVLTLAPL 

LLICLSYLFILSAILRVPSAAGRCKAFSTCSAH 

RTWWFYGTISFMYFKPKAKDPNVDKTVAL 

FYGVVTPSLNPIIYSLRNAEVKAAVLTLLRGG 

LLSRKASHCYCCPLPLSAGIG 


681 


2031 


A 


5207 


10 


247 


VPDNGDVTKLPVCSTLVEETSLTVSEAMEQSI 
JCNESPLPGTLAHTCNTSTLGGRGRWIT*GREF 
DTSMANM\TCFCLYRK 


682 


2032 


A 


5210 


2 


231 


FFFETESYSITQAGVQWPNLSSLKTLPPGFK*F 

SCLSLPSSWDYRCLPPCPANFCIFSRNGVLPC 

WPGWSRTPDLS 


683 


2033 


A 


5218 


85 


402 


CPSVSGLtKSDLRRHNINIGrTNVDVKAVSNIF 
MIILLRSMYRINVKPYFFI*LFFSRVNC*SVnG 
YARCYTFLIF*LFL*IPADSPrDQEPKTVMLSK 
QSESAI 


684 


2034 


A 


5220 


1 


194 


NLMKEMQNLNSENHKTWEE YKDTK* IMS YF 
YG*ALNVIKMAVLPKIJvlYRFSATLVKIPQHL 
TDS 


685 


2035 


A 


5228 


260 


440 


LHSQDGNSDPRKPQGEMSAHAFPVQTCGEED 
QKKTPQVPINFTELSKCS* S*KIMSGERE 


686 


2036 


A 


5239 


79 


508 


GGEAAARAAKLSSPRPHRVGRRERGVGGMS 
AFSEAALEKKLSELSNSQQSVQTLSLWLIHHR 
KHSRPIVTVWERELRKAKPNRKLTFLYLAND 
VIQNSKRKGPEFTKDFAPVIVEAFKHVSSETD 
ESCKKHLGRVLSIWEERS 


687 


2037 


A 


5244 


1 


428 


MAAVVAATALKGRGARNARVLRGILAGATA 
NKASHNRTRALQSHS SPEGKEEPEPLSPELEYI 
PRKJIGKNPMKAVGLAWAIGFPCGILLFILTKR 
EVDKDRVKQMKARQNMRLSNTGEYESQRFR 
ASSQSAPSPDVGSGVQT 


688 


2038 


A 


5249 


1 


1407 


LQQTEDKSLLNQGSSSEEVAGSSQKMGQPGP 

SGDSDLATALHRLSLRRQNYLSEKQFFAEEW 

QRKIQVLADQKEGVSGCVTPTESLASLCTTQS 

EITDLSSASCLRGFMPEKLQIVKPLEGSQTLY 

HWQQLAQPNLGTILDPRPGVITKGFTQLPGD 

AIYfflSDLEEDEEEGlTFQVQQPLEVEEKLSTS 

KPVTGIFLPPlTSAGGPVTVATANPGKCLSCr 

NSTFTFTTCRILHPSDITQVTPSSGFPSLSCGSS 

GSSSSNTAVNSPALAYRLSIGESITNRRDSTTT 

FS STMSLAKLLQERCI3AICY Y KoriSEiNrLQrL 

PKSLAJPSTPPNSPSHSPCPSPLPFEPRVHLSEN 

FLASRPAK1VLQEMYGLRPSRNPPDVGQLKM 

NLVDRLKRLGIARWKNPGAQENGRCQEAEJ 

GPQKPDSAVYLNSGSSLLGGLRRNQSLPVIM 

GSFAAPVCTSSPKMGVLKED 1 


689 


2039 


A 


5254 


2 


2621 


LSLFGSRALrjRSGARAMAKAKKVGARkkAS 

GAPAGARGGPAKANSNPFEVKVNRQKFQILG 

RKTRHDVGLPGVSRARALRKRTQTLLKEYKE 

RDKSNVFRDKRFGEYNSNMSPEEKMMKRFA 

LEQQRHHEKKSIYNLNEDEELTHYGQSLADIE 

KHNDIVDSDSDAEDRGTLSGELTAAHFGGGG 

GLLHKKTQQEGEEREKPKSRKELIEELtAKSK 

QEKRERQAQREDALELTEKLDQDWKEIQTLL 

SliKTPKSENRDKKEKPKPDAYDMMVRELGF 

EMKAQPSNRMKTEAELAKEEQEHLRKLEAE 

RLRRMLGKDEDENVKKPKHMSADDLNDGFV 

LDKDDRRLLSYKDGKMNVEEDVQEEQSKEA 

SDPESNEEEGDSSGGEDTEESDSPDSHLDLES 

NVESEEENEKPAKEQRQTPGKGLISGKERAG 

KATRDELPYTFAAPESYEELRSLLLGRSMEEQ 

LLVVERIQKCNHPSLAEGNKAKLEKLFGFLLE 

YVGDLATDDPPDLTVIDKLWHLYHLCQMFP 

ESASDAIKFVLRDAMHEMEEMIETKGRAALP 

GLD VLIYLKITGLLFPTSDFWHPWTPALVCL 

SQLLTKCPHSLQDWKGLFVCCLFLEYVALS 

QRFIPELINFLLGILYIATPNKASQGSTLVHPFR 

ALGKNSELLWSAREDVATWQQSSLSLRWA 

SRLRATTSTEANHIRLSCLAVGLALLKRCVLM 

YGSLPSFHA1MGPLRALLTDHLADCSHPQELQ 

ELCQSTLTEMESQKQLCRPLTCEKSKPVPLKL 

FTPRLVKVLEFGRKQGSSKEEQERKRLIHKHK 

REFKGAVREIRKDNQFLARMQLSE1MERDAE 

RKRKVKQU^SLATQEGEWKALKRKKFKK 


690 


2040 


A 


5261 


1 


304 


rm VFL VETGFHHVGQAGLELLTSGDPPTW 

ASQSAGI7XJVSHCSWPVIYVLSTLLHAVRNVL 

FKRTFPLKSSSFLSYDKEIFPIUVLKFYLVTLT 
SFVK 


691 


2041 


A 


5270 


3 


158 


w^HiiHLTANWVlILPGTPPGWKIDGPAAAt
EVLS SFFFFFLKFSYKPQNIV 


692 


2042 


A 


5282 


56 


1268 


GMEPVGCCGECRGSSVDPRSTFVLSNLAEW

ERVLTFLPAKALLRVACVCRLWRECVRRVLR 

THRSVTWISAGLAEAGHLEGHCLVRWAEEL 

ENVRILPHTVLYMADSETFISLEECRGHKRAR 

KRTSMETAtALEKLFPKQCQVLGIVTPGIWT 

PMGSGSNRPQEIEIGESGFALLFPQIEGIKIQPF 

HFIKDPKNLTLERHQLTEVGLLDNPELRWLV 

FGYNCCKVGASNYLQQWSTFSDMNIILAGG 

QVDNLSSLTSEKNPLDIDASGVVfiT <?F^r T WRr 

QSATVLLNEDVSDEKTAEAAMQRLKAANIPE 
HNTIGFMFACVGRGFQYYRAKGNVEADAFR 
KFFPSVPLFGFFGNGEIGCDRIVTGNF1LRKCN 
EVKDDDLFHSYTTIMALIHLGSSK 


693 
694


2043 


A 


5301 


362 


507 


EEIKERFGPGL VI YW YGFIQELDCNRERGILLK 
ACFPTNIVTLCHSIA


695


2044 
2045 


A 
A 


5310 
5315 


1 

L25 


204 
1596 


RVLTA1NHTLKEWLRKFYKGKKDKPLDLRPK 
KTRAMRRRLNMHEENLKTKKQHRKERLYPL 
RKYAAKA 

bXKyiAVKSEVQVCISLLLCLEDRTMPKKAKP 

TGSGKEEGPAPCKQMKLEAAGGPSALNFDSP 

SSLFESLISPIKTBTFFKEFWEQKPLLIQRDDPA 

LATYYGSLFKLTDLKSLCSRGMYYGRDVNV 

CRCVNGKKKVLNKDGKAHFLQLRKDFDQKR 

ATIQFHQPQRFKDELWRIQEKLECYFGSLVGS 

NVYITPAGSQGLPPHYDDVEVFELQLEGEKH 

WRLYHPTVPIAREYSVEAEERIGRPVHEFML 

KPGDLLYFPRGTIHQADTPAGLAHSTHVTIST 

YQNNSWGDFLLDTISGLVFDTAKEDVELRTG 

IPRQLLLQVESTTVATRRLSGFLRTLADRLEG 

TKEIXSSDMKKDFIMHRLPPYSAGDGAELSTP 

GGKLPRJLDSVVRLQFKDHIVLTVLPDQDQSD 

ETQEKMVYIYHSJLJCNSRETHMMGNEEETEFH 

GLRFPLSHLDALKQIWNSPAISVKDLKLTTDE 

EKESL VL SLWTECLIQ W 


696 


2046 


A 


5318 


1476 


742 


LMKXYLEAAELGEISDIHTKLLRLSSSQGTIET 

SLQDIDSRLSPGGSLADAWAHQEGTHPKJDRN 

VEKiQVLLNCMTElYYQFKJCDKAERRLAYN 

EEQIHKFDKQKLYYHATKAMTHFTDECVKK 

YEAFLNKSEEWIRKMLHLRKQLLSLTNQCFDI 

EEEVSKYQE YTNELQETLPQKMFTAS SGIKHT 

MTPIYPSSNTLVEMTLGMKKLKEEMEGVVKE 

LAENNHILESGGSLTMDGGLRNVDCL 


697 


2047 


A 


5320 


244 


478 


LDYNFFLFEMTFGLVSQAG VQ WHDLGSLQPP 
PPGFKQFSCLSLPSSWDYRHLPPHLANFSREG 
VSPSWPGWSRTPDFR 


698 


2048 


A 


5324 


266 


714 


LPIRKSLRS\'RSGFPTSQSP1TRNLDGTASGSC 

LAKTVTGSLFRINVGLRGLVAGGIIGALLGTP 

VGGLLMAFQKYSGETVQERKQKDRKALHEL 

KLEEWKGRLQVTEHLPEKIESSLQEDEPENDA 

KKIEALLNLPRNPSVIDKQDKD 


699 


2049 


A 


5334 


699 


277 


RPHGHLVCISSSAGLSGVNGLADYCASKFAA 

FGFAESVFVETFVQKQKGIKTTIVCPFFIKTGM 

FEGCTTGCPSLLPILEPK.YAVEKIVEAILQEKM 

YLYMPKLLYFMMFLKSFLPLKTGLLIADYLGI 

LHAMDGFADQKK 


700 


2050 


A 


5344 


3 


614 


PTAEEMS SLTPES SPEL AKRS WFGNFISLDKEE 

QIFLVLKDKPLSSKADIVHAFLSIPSLSHSVLS 

QTSFRAEYKASGGPSVFQKPVRFQVDISSSEG 

PEPSPRRDGSGGGGIYSVTFTLISGPSRRFKRV 

VETIQAQIXSTHDQPSVQALADEKNGAQTRP 

AGAPPRSLQPPPGRPDPELSSSPRRGPPKDKK 

LLATNGTPL 


701 


2051 


A 


5346 


3 


1383 


HASVLFCRVMAASKTQGAVARMQEDRDGSC 

STVGGVGY GDSKDCILEPLSLPESPGGTTTLE 

GSPSVPCffCEEIiFPVAEQDKLLKHMlIEHKIV 

IADVKLVADFQRYILYWRKRFTEQPITDFCSV 

IRINSTAPFEEQENYFLLCDVLPEDRILREELQ 

KQRLREHEQQQQERNDTNFHGVCMFCKEEF 

LGNRSVILNHMAREHAFN1GLPDNIVNCNEFL 

CTLQKKXDNLQCLYCEKTFRDKNTLFCDHMR 

KKQHRJfONPKNREYDRFYVIN YLELGKS WEE 

VQLEDDRELLDHQEDDWSDWEEHPASAVCL 

FCEKQAETIEKLYVHMEDAHEFDLLKIKSELG 

IJ^YQQVKLVNFIRRQVHQCRCYGCHVKFKS 

KADLRTHMEETKHTSLLPDRKTWDQLEYYFP 

TYENDTLLWTLSDSESDLTAQEQNENVPIISE 

DTSKLYALKQSSILNQLLL 


702 


2052 


A 


5356 


2502 


1540 


MAAATRGCRPWGSLLGLLGLVSAAAAAWD 
AGQHLAKALVVKAIJCAFVRDPAPTKPLVLSL

HGWTGTGKSYVSSLLAHYLFQGGLRSPRVH 

HFSPVLHFPHPSHIERYKfCDLKSWVQGNLTA 

CGRSLFLFDEMDKMPPGLMEVLRPFLGSSWV 

VYGTNYRKAIFinSNTGGEQINQVALEAWRS 

RRDREEILLQELEPVISRAVLDNPHHGFSNSGI 

MEERLLDAWPFLPLQRIH IVRIICVLNIILAQL 

GLEPRDEWQAVLDSTTFFPEDEQLFSSNGCK 

TVASRIAFFL 


703 


2053 


A 


5380 


278 


657 


LFLQKD^MKTEEEARTHTEIEMFLRKEQQKL 

EERLEFWMEKYDKDTEMKQNELNALKATKA 

SDLAHLQDLAKMIREYEQVIIEDRIEKERSKK 

KVKQDLLELKSVIKLQAWWRGTMIRREIGGF 

KM 


704 


2054 


A 


5381 


1 


1003 


FRGRAVKMAAWEVEVGGGAAGERELDEV 

DMSDLSPEEQWRVEHARMHAKHRGHEAMH 

AEMVULIATLWAQLLLVQWKQRHPRSYN 

MVTLFQMWVVPLYFTVKLHWWRFLVIWILF 

SAVTAFVTFRATRKPLVQ1TPRLVYKWFLLIY 

KISYATGIVGYMAVMFTLFGLNLLFKIKPEDA 

MDFGISLLFYGLYYGVLERDFAEMCADYMA 

STIGF YSESGMPTKHLSDSV CAVCGQQtFVDV 

SEEGIIENTYRLSCNHVFHEFCIRGWCIVGKK 

QTCPYCKEKVDLKRMFSNPWERPHVMYGQL 

LDWLRYLVAWQPVnGWQGINYILGLE 


705 


2055 


A 


5396 


3 


675 


IYDRDPLQLATRAGQPLDINMAGEPKPYRPKP 

GNKRPLSALYRLESKEPFLSVGGYVFDYDYY 

RDDFYNRLFDYHGRVPPPPRAVPLKRPRVA 

VTTTRRGKGVFSMKGGSRSTASGSTGSKLKS 

DELQTDCKELTQIKTKIDSVLGRLDKIEKQQK 

AEAEAQKKLLEESLVL1QEECVSEIADHSTEEP 

AEGGPDADGEEMTDGIEEAFDEDGGHELFLQ 

DC 


706 


2056 


A 


5410 


2 


98 


ORVGLNLEGR0CSEPK\VRHCTPT*ATEQbSl
S 


707 


2057 


A 


5415 


6 


287 


PFKLTPSFLSHAFSSGQERKVFIELNHIKKCNT 
VRGVFVLEEFGNYTILLLGLDSHGSNSNLGAP 
EEGLGAGRKRTSVEKSGGAGVTRKKRDP 


708 


2058 


A 


5423 


3 


291 


SSSNPLGSPSTLWRLCSFVLHNKSCCCSFFGS 
TPTLRAITLTVRVCGFIPEVSKTTNPLGRTHNS 
GCTIFKTVTLTARSTASLLKSVRPRTHQKE 


709 


2059 


A 


5424 


679 


347 


RIRHEHCRGSRGRGRRTSEEDTPKKKKHKGG 
SEFTDT1LSVHPSDVLDMPVDPNEPTYCLCHQ 
VSYGEMIGCDNPDCPIEWFHFACVDLTTKPK 
GKWFCPRCVQEKRKKK 


710 


2060 


A 


5442 


1073 


559 


QESLKKKIQPKLSLTLSSSVSRGNVSTPPRHSS 
GSLTPPVTPPITPSSSFRSSTPTGSEYDEEEVDY 
EESDSDESWTTESAISSEAILSSMCMNGGEEK 
PFACPVPGCKKRYKNVNGIKYHAKNGHRTQI 
RVRKPFKCRCGKSYKTAQGLRHHTINFHPPV 
SAEHRKMQQ 


711 


2061 


A 


5449 


1 


319 


GDSLCVPQYNKYREERVILFLKMASGHAFQP 
DLVKRIRDAIRMGLSARHVPSLILETKGIPYTL 
NGKKVEVAVKQIIAGKAVEQGGAFSNPETLD 
LYRDIPELQGF 


712 


2062 


A


5499 


91 


749


RPTTGHGDFWMQPLTKDAGMSLSSVTLASAL 
QVRGEALSEEEIWSLLFLAAEQLLEDLRNDSS 
DYWCPWSALLSAAGSLSFQGRVSHIEAAPF 
KAPELLQGQSEDEQPDASQMHVYSLGMTLY
WSAGFHVPPHQPLQLCEPLHSILLTMCEDQPH
RRCTLQSVLEACRVHEKEVSVYPAPAGLHIR
RLVGLVLGTISEVSREPCFSSSSCWSCVAIKI


713 


2063 


A 


5506 


22 


478 


VEELILVSRLDPHLHTPMYm-AHLSFLDLSFT 
TSSIPQLLYNLNGCDKTTSYMGCAIQLFLFLGL 
GGVECLLLAVMAYDRCVAICKPLHYM\TMN 
PRLCRGLVSVWGCGVANSLAMSPVTLRLPR 
CGHHEVDHFLCEMPALIRMACISTV 


714 


2064 


A 


5514 


25 


220 


AIRPYWCENNIIGIGBCLSTADGKAFADPEVLR 
RLTSSVSCALDEAAAALTRMRAESTANAGQS 
DK 


715 


2065 


A 


5526 


3 


810 


KNHAPRRPQRYSSGHGSDNSSVLSGELPPAM 

GRTALFHHSGGSSGYESLRRDSEATGSASSAP 

DSMSESGAASPGARTRSLKSPKKRATGLQRR 

RLIPAPLPDTTALGRKPSLPGQWVDLPPPLAG 

SLKEPFEIKVYEIDDVERLQRPRPTPREAPTQG 

lAWSTRLRlAEimQQRLREVQAKHKHLCEE 

LAETQGRLMLEPGRWLEQFEVDPELEPESAE 

YLAALERATAALEOCVNXCKAHVMMVTCFD 

ISVAASAUPGPQEVDV 


716 


2066 


A 


5529 


458 


790 


SPGYGEN1CFTVTSXNIA\T>LCEM3^KIYSYYSD 
SSSSERTMDLVLEMCNTNSIHWCGISGRQLG 
KLHPSSSLCLALTLLSSVQGLQSISGLRLTDTF 
LKRTYEYDDIAQVCV 


717 


2067 


A 


5531 


3 


460 


NSEDLLKYFNPESWQEDLDNMYLDTPRYRG 
RSYHDRKSKVDLDRLKDDAKRYSCTPRNYS 
VNIREELKLANWFFPRCLLVQRCGGNCGCG 
TVN WRSCTCNS GKTVKKYHEVLQFEPGHIKR 
RGRAKTMAL VDIQLDHHERCDCIC S SRPFR 


718 


2068 


A 


5586 


311 


88 


AVLKNMAPMTALGLLDLHILNLILFLSAGEDF 

TSWSEI^rra.LWLTLWLLIEMTYCYRKVS 

KAEEAAQENA 


719 


2069 


A 


5598 


1 


330 


KNCANEAWQKILDRVLSRYDVRLRPNFGSM 
LATOSTRGLNEDELMAHGQEKDSSSESEDSC 
PPSPGCSFTEGFSFDLLNPDYVPKVDKWSRFL 
FPLAFGLFNIVAAERC 


720 


2070 


A 


5628 


798 


148 


LPPAQIPEAWLLLANVVVVLILVPLKDRLIDP 

LLLRCKLLPSALQKMALGMFFGFTSVIVAGV 

LEMERLHYIHHNETVSQQIGEVLYNAAPLSIW 

WQIPQYLLIGISEIFASIPGLEFAYSEAPRSMQG 

AIMGIFFCLSGVGSLLGSSLVALLSLPGGWLH 

CPKI)FGMNNCRMDLYFFLLAGIQAVTALLF 

VWIAGRYERASQGPASHSRFSRDRG 


721 


2071 


A 


5632 


146 


536 


MSALIVRKLRSAELTLFSELPTVLGANVNAA 
KLHETAUmAAKVKNVDLIEMLIEFGGKIYA 
RDNRGKKPSDYTWS SSAPAKCFEYYEKTPLT 
LSQLCRVNLRKATGVRGLEKIAKLNIPPRLID 
YLSYN 


722 


2072 


A 


5638 


3 


3806 


CPSLDIRSEVAELRQLENCSWEGHLQILLMF 

TATGEDFRGLSFPRLTQVTDYLLLFRVYGLES 

LRDLFPNLAVIRGTRLFLGYALV1FEMPHLRD 

VALPALGA VLRGA VRVEKNQELCHL STIDW 

G1XQPAPGANHTVGNKJLGEECADVCPGVLGA 

AGEPCAKTTFSGHTDYRCWTSSHCQRVCPCP 

HGMACTARGECCHTECLGGCSQPEDPRACV 

ACRHLYFQGACLWACPPGTYQYESWRCVTA 

ERCASLHSVPGRASTFGIHQGSCLAQCPSGFT 

RNSSSIFCHKCEGLCPKECKVGTKTIDSIQAA 
GLVETTTGFLK1KHSFALVSLGFFKNLKURGD

AMVDGNYTLYVLDNQNLQQLGSWVAAGLTI 

PVGKIYFAFNPRLCLEHIYRLEEVTGTRGRQN 

KAEINPRTNGDRAACQTRTLRFVSNVTEADRI 

LLRWERYEPLEARDLLSFIVYYKESPFQNATE 

HVGPDACGTQSWNLLDVELPLSRTQEPGVTL 

ASLKPWTQYAVFVRAITLTTEEDSPHQGAQS 

PIVYLRTLPAAPTVPQDVISTSNSSSHLLVRW 

KPPTQRNGNLTYYLVLWQRLAEDGDLYLND 

YCHRGLRLPTSNNDPRFDGEDGDPEAEMESD 

CCPCQHPPPGQVLPPLEAQEASFQKKFENFLH 

NAITIPISPWKVTSINKSPQRDSGRHRRAAGPL 

RLGGNSSDFEIQEDKVPRERAVLSGLRHFIBY 

RIDIHACNHAAHTVGCSAATFVFARTMPHRE 

ADGCPGKVAWEASSKNSVLLRWLEPPDPNGL 

1LKYEDCYRRLGEEATVLCVSRLRYAKFGGV 

HLALLPPGNYSARVRATSLAGNGSWTDSVAF 

YILGPEEEDAGGLHVLLTATPVGLTLUVLAA 

LGFFYGKKRNRTLYASVNPEYFSASDMYVPD 

EWEVPREQISIIRELGQGSFGMVYEGLARGLE 

AGEESTPVALKTVNELASPRECIEFLKEASVM 

KAFKCHHVVRLLGWSQGQPTLVIMELMTR 

GDLKSHLRSLRPEAENNPGLPQPALGEMIQM 

AGEIADGMAYLAANKFVHRDLAARNCMVSQ 

DFTVK1GDFGMTRDVYETDYYRKGGKGLLP 

VRWMAPESLKDGIFTTHSDVWSFGVVLWEIV 

TLAEQPYQGLSNEQVLKFVMD GG VLEELEGC 

PLQLQELMSRCWQPNPRLRPSFTHILDSIQEEL 

RPSFRLLSFYYSPECRGARGSLPTTDAEPDSSP 

TPRDCSPQNGGPGH 


723 


2073 


A 


5672 


1 


216 


lawldnilpekekketdkkrkrkkgahedcd 
eepqfpppsvdopmesvqsdpqngihciarkr 
ssswsysl 


724 


2074 


A 


5704 


4235 


940 


ARGRRSRPVWAASWGGRGRPAARRRPRGLA 

ATMGFELDRFDGDVDPDLKCALCHKVLEDP 

LTTPCGHVFCAGCVLPWWQEGSCPARCRGR 

LSAKELNHVLPLKRLILKLD1KCAYATRGCGR 

WKLQQLPEHLERCDFAPARCRHAGCGQVLL 

RRDVEAHMRDACDARPVGRCQEGCGLPLTH 

GEQRAGGHCCARALRAHNGALQARLGALHK 

ALKKEALRAGKREKSLVAQLAAAQLELQMT 

ALRYQKKFTEYSARLDSLSRCVAAPPGGKGE 

ETKSLTLVLIIRDSGSLGFNIIGGRPSVDNHDG 

SSSEGIFVSKIVDSGPAAKEGGLQIHDRIIEVN 

GRDLSRATHDQAVEAFKTAKEPIYVQVLRRT 

PRTKMFTPPSESQLVDTGTQTDITFEHIMALT 

KMSSPSPPVLDPYLLPEEHPSAHEYYDPNDYI 

GDIHQEMDREELELEEVDLYRMNSQDKLGLT 

VCYRTDDEDDIGIYISEIDPNSIAAKDGRIREG 

DRIIQINGIEVQNREEAVALLTSEENKNFSLL1 

ARAELQLDEG WMDDDRNDFLDDLHMDMLE 

EQHHQAMQFTASVLQQKKHDEDGGTTDTAT 

HSNQHEKDSGVGRTDESTRNDESSEQENNG 

DDATASSNPLAGQRKLTCSQDTLGSGDLPFS 

NESFISADCTDADYLGIPVDECERFRELLELK 

CQVKSATPYGLYYPSGPLDAGKSDPESVDKE 

LELLNEELRSIELECLSIVRAHKMQQLKEQYR 

ESWMLHNSGFRNYNTSIDVRRHELSDITELPE 

KSDKDS SSAYNTGESCRSTPLTLEISPDNSLRR 

AAEGISCPSSEGAVGTTEAYGPASKNLLSITE 

DPEVGTPTYSPSLKELDPNQPLESKERRASDG 

SRSPTPSQKLGSAYLPSYHHSPYKHAHIPAHA 

QHYQSYMQL1QQKSAVEY AQSQMSLVSMCK 

DLSSPTPSEPRMEWKVKIRSDGTRYITKRPVR 

DRLLRJERALKIREERSGMTTDDDAVSEMKM 

(^YWSKEERKQHLVKAKEQRRRREFMMQSR 

LDCLKEQQAADDRKEMNILELSHKKMMKKR 

NKKIFDNWMTIQELLTHGTKSPDGTRVYNSF 

LSVTTV 


725 


2075 


A 


5707 


3 


1770 


QISTEVSEAPVANDKPKTXWKVQKKAADLP 

DRDTWKGRFDFLMSCVGYAIGLGNVWRFPY 

LCGKNGGGAFLIPYFLTLIFAGVPLFLLECSLG 

QYTSIGGLGVWKLAPMFKGVGLAAAVLSFW 

LNIYYIVIISWAIYYLYNSFTTTLPWKQCDNP 

WNir)RCFSNYSMVKTTNMTSAVVEFWERN 

MHQMTDGLDKPGQIRWPLATTLAIAWILVYF 

CIWKGVGWTGICVV YFSA1 YP YIML1ILFFRGV 

TLPGAKEGILFYITPNFRKLSDSEVWLDAATQ 

IFFSYGLGLGSLIALGSYNSFHNNVYRDSIIVC 

CmSCTSMFAGFVIFSIVGFMAHVTKRSIADV 

AASGPGLAFLAYPEAVTQLPISPLWAILFFSM 

LLMLGIDSQFCTVEGFITALVDEYPRLLRNRR 

ELFIAAVaiSYUGLSNrTQGGIYVFKJLFDYYS 

ASGMSLLFLVFFECVSISWFYGVNRFYDNIQE 

MVGSRPCIWWKLCWSFFTPI1VAGVFIFSAVQ 

MTPLTMGNYVFPKWGQGVGWLMALSSMVL 

IPGYMAYMFLTLKGSLKQRIQVMVQPSEDIV 

RPENGPEQPQAGSSTSKEAYI 


726 


2076 


A 


5711 


156 


423 


PRRDPGRTPELRGSAPRKTGANMPVRRGHVA 

PQNTFLGTIIRKFEGQNKKF1IANARVQNCAII 

YCNDGFCEMTGFSRPDVMQKPCTCD 


727 


2077 


A 


5716 


3 


274 


HASEYFFKLCSFQVFLSFPLATIVIDVGLVVIP 
LVKSPNVWVYVLLLVLSGLLFYIPLIHFKIRL 
AWFEKMTCYLQLLFNICLPDVSEE 


728 


2078 


A 


5737 


1899 


649 


1QASRASPYPRVKVDFALSCHEDLLAPISEPIE 

WKYHSPBEHSLGPACWLWDFLRJRSQQAGFL 

LPLSGGVDSAATACLIYSMCCQVCEAVRSGN 

EEVLADVRTrVNQISYTPQDPRDLCGRILTTC 

YMASKNSSQETCTRARELAQQIGSHHISLNID 

PAVKAVMGIFSLVTGKSPLFAAHGGSSRENL 

ALQNVQARIRMVLAYLFAQLSLWSRGVHGG 

LLVLGSANVDESLLGYLTKYDCSSADINPIGG 

1SKTDLRAFVQFCIQRFQLPALQSILLAPATAE 

LEPLADGQVSQTDEEDMGMTYAELSVYGKL 

RKVAKMGPY SMFC3CLLGM WRHICTPRQ VAD 

KVKRFFSKYSMNRHKMTTLTPAYHAENYSPE 

DNRFDLRPFLYNTSWPWQFRCIENQVLQLER 

AEPQSLDGVD 


729 


2079 


A 


5741 


1 


5976 


PGCAARLSRARAPGPGAAGAGRKRLADPGPP 
PASRRLRAPGSRPRLAPCTRRAAQPAHARMA 
PRAAGGAPLSARAAAASPPPFQTPPRCPVPLL 
LLLLLGAARAGALEIQRRFPSPTPTNTNFALDG 
AAGTVYLAAVNRLYQLSGANLSLEAEAAVG 
PVPDSPLCHAPQLPQASCEHPRRLTDNYNKJL 
QLDPGQGLVWCGSIYQGFCQLRRRGNISAV 
AVRFPPAAPPAEPVTVFPSMLNVAANHFNAS 
TVGLVLPPAAGAGGSRLLVGATYTGYGSSFF 
PRNRSLEDHRFENTPEIAIRSLDTRGDLAKLFT 
urn WPcnnwiT irjvryn. a wmrvj ocifexrY rVi* ' 

SDPPPGAQSYAYLALNSEARAGDKESQARSL 

LARICLPHGAGGDAKKLTBSY1QLGLQCAGG 

AGRGDLYSRLVSVFPARERLFAVFERPQGSPA 

ARAAPAALCAFRFADVRAAIRAARTACFVEP 

APDWAVLDSVVQGTGPACERKLN1QLQPEQ 

LDCGAAHLQHPLSILQPLKATPVFRAPGLTSV 

AVASVNNYTAVFLGTVNGRLLKINLNESMQ 

WSRRWTVAYGEPVHHVMQFDPADSGYLY 

LMTSHQMARVKVAACKVHSTCGDCVGAAD 

AYCGWCALETRCTLQQDCTNSSQQHFWTSA 

SEGPSRCPAMTVLPSEIDVE.QEYPGM1LQISGS 

LPSLSGMEMACDYGNN1RTVARVPGPAFGHQ 

IAYCNLLPRDQFPPFPPNQDHVTVEMSVRVN 

GRNrVK^NFTIYDCSRTAQVYPHTACTSaSA 

QWPCFWCSQQHSCVSNQSRCEASPNPTSPQD 

CPRTLLSPLAPVPTGGSQNILVPLANTAFFQG 

AALECSFGIEEWFAVWWSVVRCDQVVlJrl 

TTRKSQVFPLSLQLKGRPARFLDSPEPMTVM 

VYNCAMGSPDCSQCLGREDLGHLCMWSDGC 

RLRGPLQPMAGTCPAPEIRAIEPLSGPLDGGT 

LLTIRGRNLGRRLSDVAHGVWIGGVACEPLP 

DRYTVSEEIVCVTGPAPGPLSGVVTVNASKE 

GKSRDRFSYVLPLVHSLEPTMGPKAGGTRITI 

HGNDLHVGSELQVLVNDTDPCTELMRTDTSI 

ACTMPEGALPAPVPVCVRFERRGCVHGNLTF 

WYMQNPVITAISPRRSPVSGGRTITVAGERFH 

MVQNVSMAVHH1GREPTLCKVLNSTLITCPSP 

GALSNASAPVDFFINGRAYADEVAVAEELLD 

PEEAQRGSRFRLDYLPNPQFSTAKREKWIKH 

HPGEPLTLVHVSTKGAGKEQDSLGLQSHEY 

RVKJGQVSCDIQIVSDRHHCSVNESLGAAVGQ 

LPITIQVGNFNQTIATLQLGGSETAUVSIVICSV 

LLLLSWALFVFCTKSRRAERYWQKTLLQME 

EMESQIREEIRKGFAELQTDMTDLTKELNRSQ 

G1PFLEYKHFVTRTFFPKCSSLYEERYVLPSQT 

LNSQGSSQAQETHPLLGEWKIPESCRPNMEE 

GISLFSSLLDNKHFLIVFVHALEQQKDFAVRD 

RCSLASLLTIALHGKLEYYTSIMKELLVDLID 

ASAAKNPKLMLRRTESVVEKMLTNWMSrCM 

YSCLRETVGEPFFLLLCADCQQINKGSIDArrG 

KARYTLNEEWLLRENIEAKPRNLNVSFQGCG 

MDSLSVRAMDTDTLTQVKEKILEAFCKNVPY 

SQ WPRAJED VDLE WFAS STQS YELRDLDDTS V 

VEDGRKKLNTLAHYKIPEGASLAMSLIDKKD 

NTLGRVKDLDTEKYFHLVLPTDELAEPKKSH 

RQSHRKKVLPEIYLTRIXSTKGTLQKFLDDLF 

KAILSIREDKPPLAVKYFFDFLEEQAEKRGISD 

PDTLHIWKTNSLPIJIFWVNILKNPQFVFDIDK 

TDHIDACLSVIAQAFIDACSISDLQLGKDSPTN 

KLLYAKEIPEYRKIVQRYYKQIQDMTPLSEQE 

MNAHLAEESRKYQNEFNTNVAMAErYKYAK 

RYRPQIMAALEANPTARRTQLQHKFEQWAL 

MEDNIYECYSEA 


730 


2080 


A 


5744 


3 


292 


QPSPLFHSHLETLQLLRTAQLPEQVSWPWGO 
VANGKGNQRNMGSPQPSLLAFERNLELQIMG 
LGYSLLMGKLRPRVAKDTLRVHRDSTPSPLT 
LKD 


731 


2081 


A 


5747 


1 


382 


FLKCMRKAFRSSKLLQVGYTPDGKDDYRWC 
FRVDEVNWTTWNTNVGIINEDPGNCEGVKRT 
LSFSLkSSRVSGRHWKNFALVPLLREASARD 

RQSAQPEEVYLRQFSGSLKPEDAEVFKSPAAS 

GEK 


732 


2082 


A 


5753 


198 


3 


AQAES STV ASPE AT AGPLCTRIPNVPPFTPIRP 
PGKXQAQLPCPSPVRFTSARIPPASRPQTKS 


733 


2083 


A 


5754 


2 


2223 


AAGPPGLEAEGRAPESAGPGPGGDAAETPGL 

PPAHSGTmMAFRDVTVQIANQNISVSSSTAL 

SVANCLGAQTVQAPAEPAAGKAEQGETSGR 

EAPEAPAVGREDASAEDSCAEAGASGAADG 

ATAPKTEEEEEEEBTAEV GRGAEAEAGDLEQ 

LNRTSTSTKSAKSGSEASASASKDALQAMILS 

LPRYHCENPASCKS PTLSTDTLRKRLYR1GLN 

LFNINTDKGIQrLISRGFIPDTPlGVAHFLLQRK 

GLSRQM1GEFLGNSKKQFNRDVLDCWDEM 

DFSSMELDEALRKFQAHIRVQGEAQKVERLIE 

AFSQRYCMCNPEVVQQFHNPDTIFILAFAIILL 

NTDMYSPNIKPDRKMMLEDFIRNLRGVDDG 

ADIPRELVVGIYERIQQKELKSNEDHVTYVTK. 

NKLQKQAAHQREVFLFNDLLVILKLCPKKKS 
SSTYTFCKSVGLLGMQFQLFENEYYSHGITLV 
TPLSGSEKKQVLHFCALGSDEMQKFVEDLKE 
^lAFVTFT FOTPTFWFT FK'OOGTKTT SFKPCGA 

QGDPQSKQGSPTAKREAALRERPAESTVEVSI 

HNRLQTSQHNSGLGAERGAPVPPPDLQPSPPR 

QQTPPLPPPPPTPPGTLVQCQQIVKVIVLDKPC 

LARMEPLLSQALSCYTSSSSDSCGSTPLGGPG 

SPVKVTHQPPLPPPPPPYNHPHQFCPPGSLLH 

GHRYSSGSRSLV 


734 


2084 


A 


5788 


8 


362 


SSVMGDLVGQGLEEQIVARDENSWLDDGGTP 
IDDVMRVLDIDEFPQ SGNYETIGGFMMFMLR 
KIPKRTDSVKFAGYKFEWDIDNYRIDQLLVT 
RIDSKATALSPKLPDAKDKEESVA 


735 


2085 


A 


5827 


1 


1257 


MVFSAVLTAFHTG TSNTfFV V YENTYMNITL 

PPPFQHPDLSPLLRYSFETMAPTGLSSLTVNST 

AVPTTPAAFKSLNLPLOITLSAIMIFILFVSFLG 

NL WCLMVY QKAAMRSAINILLASLAFADM 

LLAVLNMPFALVTILTniWIFGKFFCRVSAMF 

FWLFVrEGVAILLIISIDRFLIIVQRQDKLNPYR 

AKVLIAVSWATSFCVAFPLAVGNPDLQIPSRA 

PQCVFGYTTNPGYQAYVILISLISFFPFLVILY 

SFMGILNTLRHNALRIHSYPEGICLSQASKLGL 

MGLQRPFQMSIDMGFKTRAFTTILILFAVFIVC 

WAPrTTYSLVATFSKHFYYQHNFFEISTWLL 

WLCYLK^ALNPLri'YWRIKKFHDACLDMMP 

KSFKFLPQLPGHTKRRIRPSAVYVCGEHRTVV 


736 


2086 


A 


5870 


3 


268 


FTRSDELARHYRTHTGEKRFSCPLCPKQFSRS 
DHLTKHARRHPTYHPDMIEYRGRRRTPRIDPP 
LTSEVESSASGSGPGPAPSFTTCL 


737 


2087 


A 


5871 


2 


521 


LTWPQLFLETLPELXHMSRPAEDGPSPGALVR 

RSSSLGYISKAEEYFLLKSRSDLMFEKQSERH 

GLARRLTTARRPPAS SEQAQQELFNELKPAV 

DGANFIVNHN4RDQNNYNEEKDSWNRVART 

VDRLCLFVVTPVMWGTAWIFLQGVYNQPPP 

QPFPGDPYSYNVQDKRFI 


738 


2088 


A 


5881 


1 


1160 


LVVTAJTADLAFPNEYTRMSTSELISELFNDCG 
LLDSSKLCDYENRFNTSKGGELPDRPAGVGV 
YSAMWQLALTLILKJX^TTFTFGMKIPSGLFIPS 
MAVGAJAGRIXGVGMEQLAYYHQEWTVFNS 


SEQ ID NO: of nucleotide sequence 


SEQ ID NO: of peptide sequence 


Method 


SEQ ID NO: in USSN 09/496,914 


Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence 


Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence 


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop codon, /=possible nucleotide deletion, \=possible nucleotide insertion) 


wcsqgadcitpglyamvgaaaclggvtrmf" 
vsl\Mmfeltggljeyi\tu^aaamtskwva 

DALGREGlYDAPMLNGYPFLliAKEEFAHKTL 

AMDVMKPRRNDPLLTVLTQDSMTVEDVETII 

SETTYSGFPYVVSRESQRLVGFVLRRDLDSIE 

NARKKQDGVVSTSnYFTEHSPPLPPYTPPTLK 

LRNIUDr^PrTVTDLTPMEIVVDIFRKLGLRQC 

LVTHNGRJULGDTKKDVLKHIAQMANQDPDSI 


739 


2089 


A 


5892 


2 


916 


TLQLAASVPFFAISLISWWLPESARWLHNGKP 

DQALQELRKVARINGHKEAKNLTIEVLMSSV 

KEEVASAKEPRSVLDLFCVPVLRWRSCAMLV 

VNFSLUSYYGLVFDLQSLGRJDIFLLQALFGA 
VDFl riRATTAl T I QT7T niJPTTnAr'cn a w km 

AILANMLVPQDLQTLRVWAVLGKGCFGISL 
TCLTIYKAELFPTPVRMTADGILHTVGRLGA 
MMGPLILMSRQALPLLPPLLYGVISIASSLWL 
FFLPETQGLPLPDTIQDLESQKSTAAQGNRQE 
AFTVESTSLLETVAT WfiAT 


740 


2090 


A 


5900 


2 


426 


RPIKTLG1GFHFSVDGVHFLTQREVQNLWKE 

NLIILDTAKKHGYEWDTFTTTMGRYKEFLQG 

KCGCHFHEWKSKLSKEYNFIKMKRSRNHIM 

GRYFSNQSKLQQGTVTNFRSPYHVRGPINQV 

CSEILLSRMCANKRTM 


741 


2091 


A 


5910 


3 


412 


RMPF^TI I TTPFWfJVIt T7ADT DTTirnCP^rvrrrrtV 
tssvirao l LtL>lx v^can vj i ULJtArLr 1 LlVybfcXlDHDV 

VSYEIKDMCDCCFHFSSVKSKILRLIEIEKRER 
QRELKEKiREERRNKLAAEMGEDGEKEFQEE 
EEEKEEEEEEEEPLPEIFIPSTPSPILCGFYSEPG 
KFWV 


742 


2092 


A 


5936 


1 


482 


MGCRLLCCWFCLLQAGPLDTAVSQTPKYLV 

TQMGNDKSEKCEQNLGHDTOYWYKQDSKK 

FLKJMFSYNNKELIINETVPNRFSPKSPDKAHL 

NLHINSLELGDSAVYFCASSQDTALQSHCIPV 

HKPPGSARKLQGSVCTCTQGS SLHSLMASDG 

VPVC 


743 


2093 


A 


5938 


1 


1566 


MNSFFGTPAASWCLLESDVSSAPDKEAGRER 
RALSVQQRGGPAWSGSLEWSRQSAGDRRRL 
GLSRQTAKSSWSRSRDRTCCCRRAWWILVPA 
ADRARRERFIMNEKWDTNSSENWHPIWNVN 
DTKilHLYSDINITYVNYYLHQPQVAAIFIISYF 
LIFFLCMvlG>nVVCFIVMRNKHMHTVTNLFI 

LNLATSDT T VfTrFTTV/rPTTT I n\T7T A fjAimxr/^ xtttv x 

CKISGLVQGISVAASVFTLVAIAVDRFQCWY 

VQEEKYYRVRLNSQNKTSPVYWCREDWPNQ 

EMRKIYTTVLFANIYLAPLSLIVIMYGRIGISLF 

RAAVPHTGRKNQEQWHVVSRKKQKIIKMLLI 

VALLFILSWLPLWTLMMLSDYADLSPNELQn 

NTVTrTFAHWLAFGNSSVNPIIYGFFNENFRRG 

FQEAFQLQLCQKRAICPMEAYALKAKSIIVLIN 

TSNQLVQESTFQNPHGETLLYRKSAEKPQQE 

LVMEELKETTNSSEI 


744 


2094 


A 


5961 


149 


327 


SHVCVSHYAGSSGCPAGAGAGAVAIGISAVA 
LYDYQGGRLGVARGAWYMEAPDIRQGDM 


745 


2095 


A 


5970 


413 


856 


GAPHTDWAWAPTPMSGLGSGRGRQGTLASS 
PLSLPLLLAGVTGILATELFDQMARPAACMV 
CGALMWIMLILVGLGFPF[MEALSHFLYVPFL 
GVCVCGAJrnXjLFLPETKGKTFQEISKELHRL 
NFPRRAQGPTWRSLEVIQSTEL 


746 


2096 


A 


5971 


3 


1343 


AQTARJOIGLELDTEGHRLFVAFSGC1VYLPLS 

RGARHGACQRSCLASQDPYCGWHSSRGCVD1 

RGSGGTDVDQAGNQESMEHGDCQDGATGSQ 

SGPGDSAYGVRRDLPPASASRSVPIPLLLASV 

AAAFALGASVSGLLVSCACRRAHRRRGKDIE 

TPGLPRPLSLRSLARLHGGGPEPPPPSKDGDA 

VQTPQL YTTFLPPPEGVTPPEL ACLPTPE STPE 

LPVKHLRAAGDFWEWNQNRNNAK±SGPGRSR 

GGHAAGGPAPRVLVRPPPPGCPGQAVEVTTL 

EELLRYLHGPQPPRKGAEPPAPLTSRALPPEP 

APALLGGPSPRPHECASPLRLDVPPEGRCASA 

PARPALSAPAPRLGVGGGRRLPFSGHRAPPAL 

T/mVPSGGPSRYSGGPGKHlXYLGRPEGYRG 

RALKRVDVEKPQLSLKPPLVGPSSRQAVPNG 

GRFNF 


747 


2097 


A 


5998 


2 


754 


DHASLPCSWNHRFDVETRHVFIGDHSGQVT1 

LKLEQENCTLVTTFRGHTGGVTALCWDPVQ 

RVLFSGSSDHSVTMWDIGGRKGTATELQGHN 

DRVQALSYAQHTRQLISCGGDGGIVVWNMD 

VERQETPEWLDSDSCQKCDQPFFWNFKQMW 

DSKKIGLRQHHCRKCGKAVCGICCSSKRSSIPL 

MGFEFEVRVCDSCHEAITDEERAPTATFHDSK 

HNIVHVHFDATRGWLLTSGTDKVIKXWDMT 

PWS 


748 


2098 


A 


6001 


2 


747 


AMVFGGWP Y VPQ YRDIRRTQN AD GFST Y V 

CLVLLVANILRILFWFGRRJFESPLLWQSAIMIL 

TMLLMLKLCTEVRVANELNARRRSFTAADS 

KDEEVKVAPRRSFLDFDPHHFWQWSSFSDYV 

QCVLAFTGVAGYITYLSIDSALFYETLGFLAV 

LTEAMLGVPQLYRNHRHQSTEGMS1KMVLM 

WTSGDAFKTAYFLLKGAPLQFSVCGLLQVLV 

DLAILGQAYAFARHPQKPAPHAVHPTGTKAL 


749 


2099 


A 


6002 


2 


447 


GRPDRSELVRMH1LEETFAEPSLQAT.QMKLK 

RARLADDLNEKIAQRPGPMELVEKNILPVDSS 

VKEAIIGVGKEDYPHTQGDFSFDEDSSDALSP 

rK3PASQESQGSAASPSEPKVSESPSPVITOTP 

AQFASVSPTVPEFLKTPPTAD 


750 


2100 


A 


6004 


2 


427 


LLTQAMLVLPHRPQWFTPGPRLQAQGFCQEG 

WRWELRLRNYVPEDEDLNKRRVPQAKPDAV 

QEKVKEQLEAAKPEPVIEEVDLAKLAPRKPD 

WDLBCRDVAKKLEKLLKRTQRAIAELIRERLK 

GQEDSLDSAVDAATEHKTC 


751 


2101 


A 


6007 


33 


1280 


TDQAKVDNQPEKLVRSAEDVSTVPTQPDNPF 

SIffDKLKRMSKSVPAFLQDESDDRETDTASE 

SSYQLSRHKKSPSSLTNLSSSSGMTSLSSVSGS 

VMSVYSGDFGNLEVKGNIQFAIEYVESLKEL 

HVFVAQCKDLAAADVKKQRSDPYVKAYLLP 

DKGKMGKKKTLVVKKTLNPVYNEELRYKIEK 

QrLKTQKLNLSlWHRDTFKRNSFLGEVELDLE 

TWDWDNKQNKQLRWYPLKRKTAPVALEAE 

NRGEMKLALQYVPEPVPGKKLPTTGEVHTWV 

KECXDLPLLRGSHLNSFVKCTHPDTSRKSRQ 

KTRAVGKTTNPIFNHTMVYDGFRPEDLMEAC 

VELTVWDHYKLTNQFLGGLRIGFGTGKSYGT 

EVDWMDSTSEEVALWEKMVNSFNTWIEATL 

PLRMLLIAKISK 


752 


2102 


A 


6028 


108 


1283 


KEIFSPFELISVKPLCLLLGVTCSQSMAFEELL 

SQVGGLGRFQMLHLVHLPSLMLLIPHILLENF 

AAAIPGHRCWVHMLDNNTGSGNETGILSEDA 

TIHSTSEADTEPCVDGWVYDQSYFPSTIVTKW 

DLVCDYQSLKSWQFLLLTGMLVGGIIGGHV 

SDRFGRRFCLRWGLLQLAITDTCAAFAPTFPV 

YCVLRFLAGFSSMIIISNNSLPITEWIRPNSKAL 

WILSSGALNIGQnLGGLAYVFRDWQTLHW 

ASVPFFVFFLL SRWLVESARWLI1TNKLDEGL 

KALRKVARTNGKNAEETLNIEWRSTMQEE 

LDAAQTKTTVWDLFRNPSMRKRICILVFLRK 

KNLKEKA 


753 


2103 


A 


6043 


1 


1470 


DSFESILRLIFEIHHSGEKGDIWFLACEQDIEK 
VCETVYQGSNLNPDLGELVWPLYPKEKCSL 
FKPLDETEKRCQVYQRRWLTTSSGEFLIWSN 
SVRFV1DVGVERRKVYNPRIRANSLVMQPISQ 

<jnAFrRVnTT fi^^nifFFrT YTPPFA^VmylTP 
OV^/YDUN^V^JJbVJOOOuVJ£I\J;rv>.Lr I iXjLjl t\0]\±Jiyllr 

LKPAEMQEANLTSMVLFMKRIDIAGLGHCDF 

MNRPAPESLMQALEDLDYLAALDNDGNLSE 

FGDMSEFPLDPQLSKSILASCEFDCVDEVXTIA 

AMVTAPNCFSHVPHGAEEAALTCWKTFLHPE 

GDHFTLISrYKAYODTTT N^S^FYfVFKWrRr) 

YFLNCSALRMADVIRAELLEIiKRIELPYAEPA 

FGSKENTLNIKKALLSGYFMQIARDVDGSGN 

YLMLTHKQVAQLHPtSGYSITKKMPEWVLF 

HKFS1SENNY1R1TSE1SPELFMQLVPQYYFSNL 

PPSESKDHQQVVDHLSPVSTMNKEQQMCET 

CPETEQRCTLQ 


754 


2104 


A 


6055 


2 


394 


YYALHHWPFPDLLCQTTGAIFQMNMYGSCIF 

LMLINVDRYAAIVHPLRLRHLRRPRVARLLC 

LGVWALILVFAVPAARVHRPSRCRYRDLEVR 

LCFESFSDELWKGRLLPLVLLAEALGFLLPLA 

AVVYSS 


755 


2105 


A 


6059 


3 


1795 


LGLGSGTLLSVSEYKKKYREHVLQLHARVKE 

RNARSVKITKRFTKLLIAPESAAPEEALGPAEE 

PEPGRARRSDTHTFNRLFRMEEGRRPLTVVL 

QGPAGIGKTMAAKKILYDWAAGKXYQGQVD 

FAFFMPCGELLERPGTRSLADLILDQCPDRGA 

PVPQMLAQPQRLLFILDGADELPALGGPEAAP 

CTDPFEAASGARVLGGLLSKALLPTALLLVTT 

RAAAPGRLQGRLCSPQCAEVRGFSDKDKKK 

YFYKFFRDERRAERAYRFVKENETLFALCFV 

PFVCWIVCTVLRQQLELGRDLSRTSKTTTSVY 

LLFITSVLSSAPVADGPRLQGDLRNLCRLARE 

GVLGRRAQFAEKELEQLELRGSKVQTLFLSK 

KELPGVLETEVTYQFIDQSFQEFLAALSYLLE 

DGGVPRTAAGGVGTLLRGDAQPHSHLVLTT 

RFLFGLLSAERMRDIERHFGCMVSERVKQEA 

LRWVQGQGQGCPGVAPEVTEGAKGLEDTEE 

PEEEEEGEEPNYPLELLYCLYETQEDAFVRQA 

LCRFPELALQRVRFCRMDVAVLSYCVRCCPA 

GQALRLISCRLVAAQEKKKKSLGKRLQASLG 

GG 


756 


2106 


A 


6060 


12 


436 


SGRPTRPAKPTGQGMGRFMLTLVCQGSIMMS 
ARDLIMNNLTELQPGLFHHLRFLEELRLSGNH 
LSHIPGQAFSGLYSLKILMLHNNQLGGIPAQA 
L WELPSLQSLRLDANLISLVPERSFEGLS SLRH 
LWLDDNALTE1PS 


757 


2107 


A 


6063 


54 


419 


1TPLGLGAADMCAFPWLLLLLLLQEGSQRRL 
WRWCGSEEWAVLQESISLPLEIPPDEEVENI1 
WSSHKSLATYVPGKEGHPATIMVTNPHYQG 




WO 01/57188 PCT/US01/03800 



QILTMIXRSLQQPSASWPRDCSSSCSW 


758 


2108 


A 


6066 


125 


438 


IGISCPAtltVPMFSHSLIGlGEEYQLPYYNMV 
PSDPSYEDMREWCVKRLRPIVSNRWNSDEC 
LRAVLKLMSECWAHNPASIU^TALRIKKTLAK 
MVESQDVK1 


759 


2109 


A 


6072 


3 


650 


PGRRFRPAALEERAMEKLREICVPFQNRGKGT 

LSSIIPNNSDTRKATETTSLSSKPEYVNPDFRW 

SKDPSSKSGNLLETSEVGWTSNPEELDPIRLA 

LLGKSGLSCQVGSATSHPVSCQEPIDEDQRISP 

KDKSTAGREFSGQVSHQTTSENQCTPIPSSTV 

HSSVADMQNMPAAVHALLTQPSLSAAFFAQ 

RYLGTLPSTGSTTLPQCHAGNATVW 


760 


2110 


A 


6077 


3 


730 


^LRLTLMEEVIXLGLKlDkEGYTSFWNDCrSSG 

LRGCMLIELPLRGRLQLEACGMRRKSLLTRK 

VICKSDAFrGDVLLDEALKHVKETQPPETVQ 

NWIELLSGETWNPLKLHYQLRNVRERLAKNL 

VEKGVLTTEKQNFLLFDMTTHPLTNNNIKQR 

LIKKVQEAVLDKWVNDPHRMDRRLLALIYL 

AHASDVLENAFAPLLDEQYDLATKRVRQLLD 

LDPEVECLKANTNEVLWAWAAFTK 


761 


2111 


A 


6078 


833 


390 


1VSFHLSGFKKFVRPFSFLSVHGLQVDEYHSV 
HQKLSADMADHSNLIRSLLVGAEDARLMRD 
MKTMKSRYMELYDLNRDLLNGYKIRWNNH 
TELLGNLKAVNQAIQRAGRLRVGKPKNQVIT 
ACRDAIRSNNINTLFKJMRVGTASS 


762 


2112 


A 


6079 


2 


2686 


KKA1TCGEKEKQDLIKSLAMLKDGFRTDRGS 

HSDLWSSSSSLESSSFPLPKQYLDVSSQTDISG 

SFGINSNNQLAEKVRLRLRYEEAKRRIANLia 

QLAKLDSEAWPGVLDSERDRJLIL1NEKEELLK 

EMRFISPRKWTQGEVEQLEMARKRLEKDLQ 

AARDTQ SKALTERLKLNSKRNQL VRELEEAT 

RQVATLHSQLKSLSSSMQSLSSGSSPGSLTSSR 

GSLYASSLDSSTSASFTDLYYDPFEQLDSELQ 

SKVEFLLLEGATGFRPSGCnTIHEDEVAKTQ 

KAEGG GRLQ ALRSLSGTPKSMTSLSPRS SLSS 

PSPPCSPLMADPLLAGDAFLNSLEFEDPELSA 

TLCELSLGNSAQERYRLEEPGTEGKQLGQAV 

NTAQGCGLKVACVSAAVSDESVAGDSGVYE 

ASVQRLGASEAAAFDSDESEAVGATRIQIALK 

YDEKNKQFAILIIQLSNLSALLQQQDQKVNIR 

VAVLPCSESTTCLFRTRPLDASDTLVFKEVFW 

VSMSYPALHQKTLRVDVCTTDRSHLEECLGG 

AQISLAEVCRSGERSTRWYNLLSYKYLKKQS 

RELKPVGVMAPASGPASTDAVSALLEQTAVE 

LEKRQEGRS STQTLED S WRYEETSENE A VAE 

EEEEEVEEEEGEEDVFTEKASPDMDGYPALK 

VDKETNTETPAPSPTWRPKDRRVGTPSQGPF 

LRGSTIIRSKTFSPGPQSQYVCRLNRSDSDSST 

LSKKPPFWNSLERRSVRMKRPSPPPQPSSVK 

SLRSERLIRTSLDLELDLQATRTWHSQLTQEIS 

VLKJELKEQLEQAKSHGEKELPQWLREDBRFR 

LLLRMLEKRMDRAEHMGELQTDKMMRAAA 

KDVHRLRGQSCKEPPEVQSFREKMAFFTRPR 

MN1PALSADDV 


763 


2113 


A 


6082 


3 


1558 


PHPIRFSKLCVSFNNQEYNQFCVIEEASKANE 

VLENLTQGKMCLVPGKTRKLLFKFVAKTED 

VGKKmiTSVDlALGr^TGRCVVLNWQGGGG 

DAASSQEALQAARSFKRRPKLPDNEVHWGSII 

IQASTMIISRVPNISVHLLHEPPALTNEMYCLV 

VTVOciranCTOTD.m/v'-' -T a ot vivihn a xrr taw 

THVTLHGTELCDESYPALLTDIPVGDLHPGEQ 

LEKMLYVRCGTVGSRMFLVYVSYLINTTVEE 

KEIVCKCHKDETVTIErVFPFDVAVKFVSTXF 

EHLERVYADIPFLLMTDLLS ASP WALTEVS SE 

LHUVPSMTTVDQLESQVDNVILQTGESASECF 

CLQCPSLGNIEGGVATGHYIISWKRTSAMBNI 

PinTVTTLPHVIVENIPLHVNADLPSFGRVRES 

LPVKYHLQNKTDLVQDVEISVEPSDAFMFSG 

LKQIRLR1LPGTEQEMLYNFYPLMAGYQQLPS 

LNINLLRFPNFTNQLLRRFIPTSIFVKPQGRLM 

DDTSIAAA 


764 


2114 


A 


4093 


1 


1422 


AAADLANSNAGAAVGRKAGPRSPPSAPAPAP 

PPPAPAPPTLGNNHQESPGWRCCRPTLRERN 

ALMFNNELMADVHFWGPPGATRTVPAHKY 

VlAVGSSVFYAMFYGDLAEVKSEIjhllPDVEPA 

AFLILLKYMYSDEIDLEADTVLATLYAAKKYI 

VFALAKACVNJKLETSLEAKNACVLLSQSRLF 

EEPELTQRCWEVIDAQAEMALRSEGFCEIDR * 

QTLEIIVTREALNTKEAWFEAVLNWAEAEC 

KRQGLPIIPRNKRHVLGRALYLVRIPTMTLEE 

FANGAAQSDILTLEETHSIFLWYTATNKPRLD 

FPLTKRKGLAPQRCHRFQSSAYRSNQWRYRG 

RCDSIQFAVDRRVFIAGLGLYGSSSGKAEYSV 

KIELKRLGVVLAQNLTKFMSDGSSNTFPVVyT 

EHPVQVEQDTFYTASAVLDGSELSYFGQEGM 

TEVQCGKVAFQFQCSSDSTNGTGVQGGQIPE 

LIFYA 


765 


2115 


A 


6099 


1 


1150 


SGFTHYAIYDFIVKGSCFCNVHADQC1PVHGF 

RPVKAPGTFHMVHGKCMCKHNTAGSHCQH 

CAPLYNDRPWEAADGKTGAPNECRTCKCNG 

HADTCHFDVNVWEASGNRSGGVCDDCQHN 

TEGQYCQRCKPGFYRDLRRPFSAPDACKPCS 

CHPVGSAVLPANSVTFCDPSNGDCPCKPGVA 

GRRCDRCMVGYWGFGDYGCRPCDCAGSCD 

PITGDCISSHTDIDWYHEVPDFRPVHNKSEPP 

WEWEDAQGFSALLHSGKCECKEQTLGNAKA 

FCGMKYSYVLK1KILSAHDKGTHVEVNVKIK 

KVLKSTKLKIFRGKRTLYPESWTDRQCTCPIL 

NPGLEYLVAGHED1RTGKUVNMKSFVQHWK 

PSLGRKVMDILKRECK 


766 


2116 


A 


6103 


2 


384 


MTAAATATVLKEGVLEKRSGGLLQLWKRKR 

CVLTERGLQLFEAKGTGGRPKELSFARIKAVE 

CVESTGRHIYFTLVTEGGQEIDFRCPLEDPGW 

NAQITLGLVKFKNQQAIQTVRARQSLGTGTL 

VS 


767 


2117 


A 


6106 


1 


542 


SGSSHASDGSGFQELRICSEDQTPLIAGMCSLP 

MARYYIIKYADQKALYTRDGQLLVGDPVAD 

NCCAEKICTLPNRGLDRTKVP1FLGIQGGSRC 

LACVETEEGPSLQLEDVNIEELYKGGEEATRF 

TFFQS SSGS AFRLEAAA WPG WFLCGPAEPQQ 

PVQLTKESEPSARTKFYFEQSW 


768 


2118 


A 


6109 


3 


292 


FHQAVLQLSSQEARYKAFGTCVSHIGAILAF 

YTPSVISSVMHRVARCAAPHVHILLANFYLLF 

PPMVNPIIYGVKTKQIRDSLGSIPEKGCVNRE 


769 


2119 


A 


6110 


1 


711 


RHEPSCSNGVASTKSKQNHSKYPAPSSSSSSS 
SSSSSSSPSSVNYSESNSTDSTKSQHHSSTSNQ 
ETSDSEMEMEAEHYPNGVLGSMSTRIVNGAY 
KHEDLQTDESSMDDRHPRRQLCGGNQAATE 
RIILFGRELQALSEQLGREYGKNLAHTEMLQD 
AFSLIAYSDPWSCPVGQQLDPIQREPVCAAL 
NSAILESQNLPKQPPLMLALGQASECLRLMA 
RAGLGSCSFARVDDYLH 


770 


2120 


A 


6125 


2 


570 


YFGLNLHVQHLGNNVFLLQTLFGAVJLLANC 
VAPWALKYMNRRASQMLLMFLLAICLLAIH 7 
WQEMQMLRBVLATLGLGASALANTLAFAH 
GNEVIPTIIRARAMGINATFANIAGALAPLMM 
ILSVYSPPLPWITYGVFPFISGFAFLLLPETRNK 
PLFDT1QDEKHERKDFREPKQEDPRVEVTQF 


771 


2121 


A 


6126 


909 


353 


RSFVLDTASAICNYNAHYKl^HPKYWCRGYF 

RDYCNIIAFSPNSTNHVALRDTGNQLIVmSC 

LTKEDTGWYWCGIQRDFARDDMDPTELIVT 

DDKGTLANDFWSGKDLSGNKTRSCKAPKW 

RKADRSRTSILHCILrrGLGnSVISHLTKRRRS 

QRNIUIVGNTLKPFSRVLTPKEMAPTEQM 


772 


2122 


A 


6148 


7 


810 


FVLGILALSHTISPFMNKFFPASFPNRQYQLLF 

TQGSGENKEEIINYEFDTKDLVCLGLSSIVGV 

WYLLRKHWIANNLFGLAFSLNGVELLHLNN 

VSTGC^LGGLFIYDVFWVFGTNVMVTVAKS 

FEAPIKLVFPQDLLEKGLEANNFAMLGLGDV 

VIPGIFIALLLRFDISLKKNTHTYFYTSFAAYIF 

GLGLTIFIMHIFKHAQPALLYLVPACIGFPVLV 

ALAKGEVTEMFSYEESNPKDPAAVTESKEGT 

EASASKGLEKKEK 


773 


2123 


A 


6161 


3 


1088 


CQPML VTRKNHPKLLLRRTES V AEKMLTN W 

FTFLLYKFIJCESAGEPLFMLYCAIKHQMEKG 

P1DAITGEARYSLSEDKLIRHLIDYKTLTLNCV 

MPENENAPEVPVKGLDCDTGTQAKEKLLDA 

AYKGVPYSQRPKAADMDLEWRQGRM ARIEL 

QDEDVTTKIDNDWKRLNTLAHY QVTDGSS V 

ALVPKQTS AYNISNS STFTKSLSRYESMLRTA 

SSPDSLRSRTPMTTPDLESGTKLWHLVKNHDH 

LDQREGDRGSKMVSEIYLTRLLATKGTLQKF 

VDDLFETnFSTAHRGSALELAIKYMFDFLDEQ 

ADKHQIHDADVRHTWKSNCLPLRFWVNV1K 

NPQFVFDIHKNSITDACLSVV 


774 


2124 


A 


6163 


860 


125 


KTAVKXR^NPWNETLRYSVPQAELQGRVL 

SLSVWHRESLGRNIFLGEVEVPLDTWDWGSE 

PTWLPLQPRVPPSPDDLPSRGLLALSLKYVPA 

GSEGAGLPPSGELHFWVKEARDLLPLRAGSL 

DTYVQCFVLPDDSRASRQRTRVVRRSLSPVF 

NHTMVYDGFGPADLRQACAELSLWDHGALA 

NRQLGOTRLSLGTGSSYGLQVPWMDSTPEEK 

QL WQ ALLEQP CE WVDGLLPLRTNL APRT 


775 


2125 


A 


6191 


2 


392 


ARGIGSLGRDHSGSGGGTGMAGAWVRKAAD 
YVRSKDFRDYLMSTHFWGPVANWGLPIAAIT 
DMKVKSPEIISRRMTFAL* CYSLTFVRFAH YVQ 
\PW^^VLMLGCI^TAVDFDQLISSMPCISHG^^T 
ASASAL 


776 


2126 


A 


6217 


1 


827 


FRGYWGVREAFTDASWSGGLGPGKPGMKIT 
RQKHAKKHLGFFRNNFGVREPYQILLDGTFC 
QAALRGRIQLREQLPRYLMGETQLCTTRCVL 
KELETLGKDLYGAKLIAQKCQVRNCPHFKNA 
VSGSECLLSMVEEGNPHHYFVATQDQNLSVK 
VKKKPGWLMFHQNTMVLDKPSPKTIAFVKA 
VESGXRLSQCMRKKVSN1SKRNRV* *KTLNRG 
RRKKRKIQSGPNPLSCLECKKKKAPDTQSSASE 
KXRKRKRIRNRSNPKVLSEKQNAEGE 


777 


2127 


A 


6236 


1038 


1402 


YYOJSSLPSlVGNGlEkWmCIFU^^ 
FQPk?RPRGGGHLRSGVLCK}PGQHraETP/SFF 
YNSKISPALWGPPVIPSALGGEAGKSL*PRRQ 
RFQRGGIAPLPSRVRGRAKLFLKKK 


778 


2128 


A 


6237 


422 


913 


ASFFHHHRGAFLLLLAIPGS*GQDQSLIHWSN 
AVSNAD\LLDLKVN*LDH\LEEKMPL\EVKVVP 
PQVL\SEPN*RSGGCFSAPSFEVPPWTGEVKP/ 
SPQRDGGALG\QGPLGPSDSILAIXKKQT*RA 
LLNWPLGSLRRSSCFGGQDGQDLKPRSGLGC 
NSFRYRR 


779 


2129 


A 


6249 


420 


36 


ARAPSPSFSVRDVELSDPARERGEMPVAVGP 

YGQSQPSCFDRVKMGFVMGCAVGMAAGAL 

FGTFSCLSS1LVSSSG/SGMRGRELMGGIGKTM 

MQSGGTFGTFMAIGMGIRC*PWLPTTSVPSH 

QSQPMY 


780 


2130 


A 


6263 


415 


1380 


RIMRMCDRGIQMLITTVGAFAAFSLMTIAVG 

TDYWLYSRGVCRTKSTSDNETSRKNEEVMT 

HSGLWRTCCLEGAFRGVCKKIDHFPEDADYE 

QDTAEYLLRAVRASSVFPILSVTLLFFGGLCV 

AASEFHRSRHNVILSAGIFFVSAGLSNIIGIIVYI 

S\ANAGRTPGQR\DSKKSYSYGWSF/YFSGAFS 

FUGR/HC* GVGLP WHIYIEKHQQLRAKSHSEF 

LKKSTFARLPPYRYRFRRRSSSRSTEPRSRDLS 

PISKGFHTIPSTDISMFTLSRDPSKITMGTLLNS 

DRDHAFLQFHNSTPKEFKESLHNNPANRRTT 

PV 


781 


2131 


A 


6274 


832 


318 


RIIK VKDLKQTLADCTAYPRCKCLVEMDQIFH 

LQVKQKQLACLCTWQARDPDCPPSTKWL/L 

VGPGMGCMVALFQDSIAWSNKSMPSSLSAIS 

QSPCQVQAPEGPSSFHLPILSFTTCLSWQGGD 

LEFLGDLKGCSELKNFQEUTQSALVHPKADV 

WWYCGRPLLGTLPSN 


782 


2132 


A 


6281 


1324 


393 


WISLPSSLLCRKNGSSAEDDRR\GEPSAEEAEG 

EREDWU1GSA*SVGAVSKVFSARF*RTYPS\E 

DEEEVTHQKSSSSDSNSEEHRKKKTSRSRNK 

KKRKNKSSKRKHRKYSDSDSNSESDTNSDSD 

DDKKRVKAXKKKKKKKHBCTKKKKN^ 

ESSDSSCKDSEEDLSEATW'MEQPNVADTMDL 

IGPEAPnHTSQDEKPLKYGHALLPGEGAAMA 

EYVKAGKKJPRRGEIGLTSEEIGSFECSGYVM 

SGSRHRRMEAVRLRKENQIYSADEKRALASF 

NQEERRKRESKILASFREMVHKKTKGKDDK 


783 


2133 


A 


6305 


201 


1032 


WDDYPQGALRRREAAEGLHFLGPPGRVRGQ 

LRGITGPAWYCHSPSHSLLSAFCHLPTPSRCP 

AMARPPVPGSVVVPNWHES/RRGQGVPGLHS 

AQEPPAGVWAA*AASAAAA\LSIDTASYKIFV 

SGKSGVGKTALVAKLAGLEVPWHHETTGIQ 

TTVWWPAKLQASSRVVMFRFEFWDCGESA 

L?JbTUnMLLt\\*MnN I DArLrLrSFTDRASFE 

DLPGQLARIAGEAPGWRMVIGSKFDQYMHT 

DVPERDLTAFRQAWELPLLRVKSVPGRRLG 


784 


2134 


A 


6308 


86 


96 


GSSPDPASUTMKNQDKKNGAAKQSNPKSSP 

GQPEAOPEGAQERPSQAAPAVEAEGPGSSQA 

PRKPEGAQARTAQSGALRDVSEELSRQLEDIL 

STYCVDNNQGGPGEDGAQGEPAEPEDAEKSR 

TYVAKNGEPEPTPVVNGEKEPSKGDPNTEEIR 

QSDEVGDIU)HRRPQEKKKAKGLGKEITLLM 

QTLNTLSTPEEKLAALCKKYAELLEEHRNSQ 

KQMKLLQKKQSQLVQEKDHLRGEHSKAVLA 

RSKXESLCRELQRHNRSLKEEGVQRAREEEE 

KRKEVTSHFQVTLNDIQLQMEQHNERNSKLR 

QENMELAERLKKLIEQYELJIEEHroKVFKHK 

DLQQQLVDAKLQQAQEMLKEAEERHQREKD 

FLLKEAVESQRMCELMKQQETHLKQQLALY 

TEKFEEFQNTLSKSSEVFTTFKQEMEKMTKKI 

KKLEKETTMYRSR WES SNKALLEMAEEKTV 

RDKELEGLQ VKIQRLEKLCRALQT/G AQ* PVR 

GQRWGSHRTSAVRIFS 


785 


2135 


A 


6319 


1493 


889 


SPQGPLLRSVSPVSAGASSVTPGGAQPGVTTT 

PPSLVAV APAPGSAAGPAAGWO*HAGCR/WT 

KIPWSWGMRFMKIFFSEEYRSISTRISHDAL* 

EKCTQPAKPLSMIRXTGSSVSPG/PLVKWNWT 

RREFRNSGTRWSSCCGMSCMYSFLGHCSV/S 

QDLPLVHVDVGWQPPLGPTVGLRPGLLPLHD 

TTPCQKLWDDLDWA 


786 


2136 


A 


6320 


551 


135 


RWLPVAECDSSCVGCTGEGPGNCKECISGYA 
REHGQCADVDECSLAEKTCVRKNENCYNTP 
GSYVCV CPDGFEET/RRCLCAAGRG* SHRRRK 
PDTAALPRRPVMCRTYPLKYSEGCPVENVAL 
RMPSPAVDSGGERLPAL 


787 


2137 


A 


6330 


1693 


227 


DYVLTAELHRQRSPGVSFGLSVFNtMNAIMG 

SGDLGLAYVMANTGVFGFSFLLLTVALLASYS 

VHLLLSMCIQTAYLGP*TNYFMVLPAH*LTCL 

PLIEFLQSL*NSL\*AVTSYEDLGLFAFGLPGKL 

WA fiTIITONIG AMSS YLLIIKTELPAAIAEFLT 

GDYSRYWYLDGQTLLII1CVGIVFPLALLPKIG 

FLG YTS SLSFFTMMFFAL VVTIKK WSIPCPLTL 

NYVEKGFQISNVTDDCKPFCLFHFSKESAYALP 

TMAFSFLCHTSILPIYCELQSPSKKRMQNVTN 

TAIALSFLIYF1SALFGYLTFYD/GTTKAQRGE 

VTCHRIKDKVESELLKG***IP*SHDVVV\mV 

KLCILFAVLLNTWLIHFPARKAVTMMFFSNFP 

FSWIRHFLrTLALNinVLLAIYWDIRNWGVV 

GASTSTCUFIFPGLFYLKLSREDFLSWKKLGV 

GCFCyLLSFKTSILRNSLSVYinJPASRKSrYFKI 


788 


2138 


A 


6351 


1 


6622 


PRSLCFSLWAEAAVLADGGLRRRRRLLRGTM 

SASFVPNGASLEDCHCNLFCLADLTGIKWKK 

YVWQGPTSAPILFPVTEEDPILSSFSRCLKADV 

LG/VWRRDQRPERRE\L* IF WGGEDPWLLTLF 

TNnYQKKKMECGRMDFPMNAVLCFSKAVH 

NLLERCLMN1WFVRIGKWFVKPYEKDEKPIN 

KSEHLSCSFTFFLHGDSNVCTSVEINQHQPVY 

LLSEEHTTLAQQSNSPFQVILCPFGLNGTLTGQ 

AFKMSDSATKKLIGEWKQFYPISCCLKEMSE 

EKQEDMDWEDDSLAAVEVLVAGVRMIYPAC 

FVLVPQSDIPTPSPVGSTHCSSSCLGVHQVPAS 

TRDPAMSSVTLTPPTSPEEVQTVDPQSVQKW 

VKFSSVSDGFNSDSTSHHGGKIPRKLANHVV 

DRVWQECNMNRAQNKRKYSASSGGLCEEAT 

AAKVASWDFVEATQRTNCSCLRHKNLKSRN 

AGQQGQAPSLGQQQQILPKHKTNEKQEKSEK 

PQKRPLTPFHHRVSVSDDVGMD\ADS\ASQRL 

VMS AP\D S Q\ VRFSNIRVTNDV AKVTPQMHGTE 

MANSPQPPPLSIAIIPCDVVDEGVTKTPSTPQS 

QHFYQMPTPDPLVPSKPMEDRIDSLSQSFPPQ 

YQEAVEPTVYVGTAVNLEEDEANIAWKYYK 

FPKKKDVEFLPPQLPSDKFKDDPVGPFGQESV 

TSVTELMVQCKKPLKVSDELVQQYQIKNQCL 

SAIASDAEQEPKJDPYAFVEGDEEFLFPDKKD 

MSLFSPSKQDAI^^ 

VSYTDLDNLFNSDEDELTPGSKRSANGSDDK 

ASCKESKTGNLDPLSC1STADLHKMYPTPPSL 

EQHIMGFSPMNMNrTOYGSMDTOGGTVLE 

GNSSSIGAQFKIEVDEGFCSPKPSEIKDFSYVY 

KPENCQUVGCSMFAPLKTLPSQYLPLIKLPEE 

CIYRQSWTVGKLELLSSGPSMPFIKEGDGSNM 

DQEYGTAYTPQTHTSCGMPPSSAPPSNSGAGI 

LPSPSTPRFPTPRTPRTPRTPRGAGGPASAQGS 

VKYENSDLYSPASTPSTCRPLNSVEPATVPS1P 

EAHSLYVNLILSESVMNLFKDCNSDSCaCVC 

NMNIKGADVGVYIPDPTQEAQYRCTCGFSAV 

MNRKFGNNSGLFFEDELDI1GRNTDCGKEAE 

KRFEALRATSAEHVNGGLKESEKLSDDLILLL 

QDQCTNLFSPFGAADQDPFPKSGVISNWVRV 

EERDCCNDCYLALEHGRQFMDNMSGGKVDE 

ALVKSSCLHPWSKRNDVSMQCSQDILRMLLS 

LQPVLQDAIQKKRTVRPWGVQGPLTWQQFH 

KMAGRGSYGTDESPEPLPIPTFLLGYDYDYLV 

LSPFALPYWERLMLEPYGSQRDIAYWLCPE 

NEALLNGAKSFFRDLTAIYESCRLGQHRPVSR 

LLTDGIMRVGSTASKKLSEKLVAEWFSQAAD 

GNNEAFSKLKLYAQVCRYDLGPYLASLPLDS 

SLLSQPNLVAPTSQSHTPPQMTNTGNANTPS 

ATLASAASSTMTVTSGVAISTSVATANSTLTT 

ASTSSSSSSNLNSGVSSNKLPSFPPFGSMNSNA 

AGSMSTQANTVQSGQLOGQQTSALQTAGISG 

ESSSLPTQPHPDVSESTMDRDKVGIPTDGDSH 

AVTYPPAIYVYIIDPFTYENTDESTNSSSVWTL 

GIXRCFLEMVQTLPPHIKSTVSVQUPCQYLLQ 

PVKHEDREIYPQHLKSLAFSAFTQCRRPLPTS 

TNVKTLTGFGPGLAMETALRSPDRPECIRLYA 

PPFILAP\aCDKQTELGETFGEAGQKYNVLFV 

GYCLSHDQRWHASCTDLYOELLETaiNIDVP 

NRARRKKSSARKFGLQKLWEWCLGLVQMSS 

LPWRWIGRLGRIGHGELKDWSCLLSRRNLQ 

SLSKRLKDMCRMCGISAADSPSILSACLVAM 

EPQGSFVIMPDSVSTGSVFGRSTTLNMQTSQL 

NIPQDTSCTHILVFPTSASVQVASATYTTENL 

DU\FNFNNDGADGMGIFDLLDTGDDLDPDfl 

NILPASPTGSPVHSPGSHYPHGGDAGKGQSTD 

RIXSTEPHEEVPNILQQPLALGYFVSTAKAGP 

LPDWFWSACPQAQYQCPLFLKASLHLHVPSV 

QSDELLHSKHSHPLDSNQTSDVLRFVLEQYN 

ALSWLTCDPATQDRRSCLPIHFWLNQLYNFI 

MNML 


789


2139 


A 


6359 


1 


2002 


TGUTEDGLDVMGWPLKGQAFLPLVPEPRR 

LPVGPLLRALATCHALSRLQDTPVGDPMDLK 

MVESTGWVLEEEPAADSAFGTQVLAVMRPP 

LWEPQLQAMEEPPVPVSVLHRFPFSSALQRM 

SVWAWPGATQPEAYVKGSPELVAGLCNPET 

VPTDFAQMLQSYTAAGYRVVALASKPLPSVP 

SLEAAQQLTRDTVEGDLSUGLLVMRNLLKP 

QTTPVIQALRRTRIRAVMVTGDNLQTAVTVA 

RGCGMVAPQEHLUVHATHPERGQPASLEFLP 

MESPTAWGVKDPDQAASYTVEPDPRSRHLA 

LSGPTFGnVKHFPKLLPKVLVQGTVFARMAP 

EQKTELVCELQKLQYCVGMCGDGANDCGAL 

KAADVGISLSQAEASWSPFTSSMASIECVPM 
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Amino acid sequence (A=Alanine, C=Cysteine,
D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion)














^lfeEGRCSLDTSF§VFKYlviALVsLTQFISVLIL 
YTOHNLGDLQFLAIDLVnTITAVLMSRTGP 
ALVLGRVRPPGALLSVPVLSSLLLQMVLVTG 
VQLGGYFLTLAQPWFVPLNRTVAAPDNLPNY 
ENTWFSLSSFQYLILAAAVSKGAPFR\RPLTN 
NVPFLLASAL* SSVLVVLVLSPGLLHGFLALR 
NTTDTGFKLLLVGLVTLNFVGGLHAGERARP 
VPPRLPAPPPAQAG\SKKRFKQLERELAEQPW 
PPLPAGPLR 


/yv 




A 

A 




if* 




SSAGSARKLOVMALAARLWRLLPFRRGAAP 

GSRLPAGTSGSRGHCGPCRFRGFEVMGNPGT 

FKRGLLLSALSYLGFETYQVISQAAWHATA 

KVEEHJBQADYLYESGETEKLYQLLTQYKESE 

DAELLWRLARASRDVAQLSRTSEEEKKLLVY 

EALEYAKRA/UEKNESSFASHKWYAICLSDV 

GDYEGIKAKIANAYIIKEHFEKAIELNPKDATS 

1HLMGIWCYTFAEMPWYQRRIA*NACLQLPP 

* FPP YEKALGNYFHRAEQVDPNFY SKNLLLLG 

KTYLKLHNKKLAAFWLMKAKDYPAHTEED 

KQIQTEAAQLLTSFSEKN 


791 


2141 


A 


6434 


3 


1460 


1ALLIVDGLAWDDQGGLALLHISPSKLIL*QDS 

SGMS/YVMVRCTITRAFFKSLLCHICX)YSIGPQ 

* VTVCPGQDACKE*KSTAN* GG*RE** PQ VLFF 

AFLSNPAVKFGRMSKKQRDSLYAEVQKHQQ 

RLQEQRQQQSGEAEALARVYSSSISNGLSNLN 

NETSGTYANG SVIDLPKSEGYYNWSGOPSP 

IX3SGLDMT\GIKQIKQEPIYDLTSVPNLFTY\SS 

FNN\GQLAPGIT\MTEIDRrAQNIIKSHLETCQY 

TMEELHQLAWQTHTYEEIKAYQSKSREALW 

QQCAIQITHAIQYWEFAJCRITGFMELCQNDQ 

ILLLKSGCLEWLVRMCRAFNPLNNTVLFEG 

KYGGMQMFKALGSDDLVNEAFDFAKNLCSL 

QLTEEEIALFSSAVLISPDRAWLDEPRKVQKLQ 

EKIWALQHVIQKNHLDDETLAKLIAKIPTITA 

VCNLHGEKLQVFKQSHPEIVNTLFPPLYXELF 

NPDCATACK 


792 


2142 


A 


6440 


92 


781 


SRGTFRCFCRDFFPCFSNMRLFLWNAVLTLFV 

TSLIGAL1PEPEVKIEVLQKPFICHRKTKGGDL 

MLVHYEGYIJEKDGSLFHSTHKHNNGQPIWFT 

LGILEALKGWGPGA* K/DMC VGEKRKLHPPA 

LGYGKEGKGKIPPESTLIFNIDLLEIRNGPRSH 

ESFQEMDLNDDWKLSKDEVKAYLKKEFEKH 

GAVVNESHHDALVEDIFDKEDEDKDGF1SAR 

EFTYKHDEL 


793 


2143 


A 


6446 


3201 


152 


PRLKRLVVTEEDGGARPEALGK1APRTPAELG 

ARADQELVTALMCDLRRPAAGGMMDLAYV 

CEWEKWSKSTHCPSWIACAWSCRNLIAFTM 

DLRSDDQDLTRMIHILDTEHPWDLHSIPSEHH 

EArrOLEWDQSGFPGFLFSRWPTOQTOCWS 

MGVSTLA\NSWE\SSVGSL\VEGGPHLWALS\ 

WLH\NGVKLALHVEKSGASSFGEKFSR\VKFS 

P\SLTLF\GGNAMEGWIAVTVSGLVTVSLLQ\P 

SGQVL\TST\ESLCRJLRARVALADIAFTG GGNI 

VVATADGSSAVSPVQFYKVCVSVVSEKCRIDT 

DILPSLrT4RCTTDLNRKDKFPAlTHLKFLARD 

MSEQVLLCASSQTSSIVECWSLRKEGLPVNNI 

FQQISPWGDKQPTILKWRILSATNDLDRVSA 

VULPKXPISLTNTDLKVASDTQFYPGLGLAL 

AFHDGSVHIVHRLSLQTMAVFYSSAAPRPVD 

EPAMKJIPRTAGPAVHLKAMQLSWTSLALVG 






WO 01/57188 PCT/US01/03800 



SEQ ID NO: of nucleotide sequence

SEQ ID NO: of peptide sequence

Method

SEQ ID NO: in USSN 09/496,914

Predicted beginning
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Amino acid sequence (A=Alanine, C=Cysteine,
D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion)














iPSHGKLSV\IiU^PSMQHPI£y G^ " 

L£YCMVTGYDWWD1LLHVQPSMVQSLVEKL 

HEEYTRQTAALQQVLSTRILAMKASLCKLSP 

CTVTRVCDYHTKLFLIA1SSTLKSLLRPHFLNT 

PDKSPGDRLTEICTKITDVDIDKVMINLKTEEF 

VLDMNTLQALQQLLQWVGDFVLYLLASLPN 

QPCPTSEPCPTSEPSPTSEPSPTSEPSSP*SLaG 

SLUUK}HSFLRIXiTSLGMLRELMVVIRrWGLL 

KPSCLPVYTATSDTQDSMSLLFRLLTKLWICC 

RDEGPASBPDEALVDECCLLPSQLLIPSLDWL 

PASDGLVSRLQPKQPLRLQFGRAPTLPGSAAT 

LQLDGLARAPGQPKJDHLRRLHLGACPTEEC 

KACTRCGCVTMLKSPNRTTAVKQWEQRWiK 

NC/LVRWALVAGAPQUPLSPAAPQLLLSYPSA 

APEPGCCKSHRSPWTLLGAVNLSPPCRAVEG 

RGPDACVTSRASEEAPAFVQLGPQSTHHSPRT 

PRSLDHLHPEDRP 


794 


2144 


A 


6490 


418 


585 


NGDKADLENESCRAQVLMPWPALWEAEGG 
GSDBPRDLRLQ*AVTIPL\TPAWVTQ 
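
As an illustration only (it is not part of the original specification), the one-letter notation defined in the table legend can be tallied mechanically. The following is a minimal Python sketch under stated assumptions: the names `RESIDUES`, `MARKERS`, `SAMPLE` and `summarize` are hypothetical, and the sample string is the peptide row for SEQ ID NO: 794 as it appears in this text, including a "B" that the legend does not define and that is presumably a scanning artifact.

```python
from collections import Counter

# The twenty one-letter residue codes listed in the table legend.
RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")

# Non-residue symbols defined by the same legend.
MARKERS = {
    "X": "unknown residue",
    "*": "stop codon",
    "/": "possible nucleotide deletion",
    "\\": "possible nucleotide insertion",
}

# Peptide row for SEQ ID NO: 794, copied as printed (raw string keeps the backslash).
SAMPLE = r"NGDKADLENESCRAQVLMPWPALWEAEGG GSDBPRDLRLQ*AVTIPL\TPAWVTQ"


def summarize(seq: str) -> dict:
    """Count residues and legend markers; list characters the legend does not define."""
    cleaned = "".join(seq.split())  # drop spaces and line breaks left by the layout
    residues = Counter()
    markers = Counter()
    unrecognized = []
    for pos, ch in enumerate(cleaned, start=1):  # 1-based position in the cleaned string
        if ch in RESIDUES:
            residues[ch] += 1
        elif ch in MARKERS:
            markers[MARKERS[ch]] += 1
        else:
            unrecognized.append((pos, ch))
    return {
        "length": sum(residues.values()),  # residues only, markers excluded
        "residues": residues,
        "markers": dict(markers),
        "unrecognized": unrecognized,
    }


if __name__ == "__main__":
    summary = summarize(SAMPLE)
    print(summary["length"], summary["markers"], summary["unrecognized"])
```

For this row the sketch counts 52 residues, one stop codon and one possible insertion, and flags the "B" at position 33 as undefined by the legend. Reporting such characters rather than silently dropping them is a deliberate choice here, since the sequences in this text contain many scanning errors.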


795 


2145 


A 


6499 


395 


1027 


KLLWLPPHSEQKRSPLYHPQGPSGTTPSAPfr 

SHSPPPSLLQA\PSIAAFLRTHGHISASGPLRMP 

FPH/H*NAFLLVFPGQRSQLTS/PSHYLCREVFP 

DHHHHLCRLSLES SPLFHHRVLFCVPKQNVN 

STRAQIFCLFVHTVGCRCINTFPLHLFRLHL'WL 

HFLQIPLGKKNKSVKLGKTWGRGCQSAAGS 

DTRVRAAVGAPGLPVEPLV 


796 


2146 


A 


6503 


68 


936 


HSALLTHSSFCVFTLCQDFFTYSSMSEEVTYA 

DLQFQNSSEMEKIPEIGKFGEKAPPAPSHVWR 

PAALFLTLLCLLLLIGLGVLASMFHVTLK1EM 

KKMNKLQNTSEELQRNISLQLMSNMNISNKER. 

NLSTTLQTIATKLCRELYSKEQEHKCKPCPRR 

WIWHKDS CYFLSDD VQTWQESKMACAAQN 

ASLLKINNKNALEFDCSQSRSYDYWLGLSPEE 

DS/YSWYESG*YNQ\PSAWV1RNAPDLKNMY 

CGYINRLYVQYYHCTYKQRMCEKMANPVQ 

LGSTYFREA 


797 


2147 


A 


6507 


1 


881 


PGSTHASARSQVPRSAGEAAPHSRRPPGLLPH 

APRAASAQLEEBMRDPHPGMTLQEGDCRGS 

QTVSLTMGTADSDEMAPEAPQHTHIDVHIHQ 

ESALAKLLLTCCSALRPRATQARGSSRLLVAS 

WVMQIVLGILSAVLGGFFYIRDYTLLVTSGA 

AIWTGAVAVLAGAAAFIYEKRGGTYWALLR 

TLLALAAFSTAJAALKLWNEDFRYGYSYYNS 

ACRISSSSDWNTPAPTQSPEEVRRLHLCTSFM 

DMLKALFRTLQAMLLGYWILLLLASLTPLWL 

/SL/RGECSQPKG*VPKKRDQKEMLEVSGI*PG 

STHASARSQVPRSAGEAAPHSRRPPGLLPHAP 

RAASAQLEERMRDPHPGMTLQEGDCRGSQT 

V bL 1 Mb 1 AIJ&DbMAPEAPQHTHlDVHiHQES 

ALAKLLLTCCSALRPRATQARGSSRLLVASW 

VMQ1VLG1LSAVLGGFFYIRDYTLLVTSGAAI 

WTGAVAVLAGAAAFIYEKRGGTYWALLRTL 

LALAAFSTAIAALKLWNEDFRYGYSYYNSAC 

RISSSSDVWOTAPTQSPEEVRRLHLCTSFMDM 

LKALFRTLQAMLLGVmLLLASLTPLWLYC 

WRMFPTKGVSP 


798 


2148 


A 


6528 


912 


2287 


VPNYLPSVSSAIGGEVPQRYVWRFCIGLHSAP 
RFLVAFAYWNHYLSCTSPCSCYRPLCRLNFG 
LNWENLALLVLTYVSSSEDF/TWVPG*GRSG 
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corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence
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D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion)














EVFPEGTGLPLPHSDLPTSWCGHSLQCGSQSS 

FPPAIHENAFIWIASSLGHMLLTCILWRLTKK 

HTVSQE\DGLSLAGAPRQPRRKSRTSVLRIRV 

MVRWELSSNGNPGRGVLGLGLGLGNKLRW 

GQNLGL*HCVWWWETGE* KRWRLQMGDS* 

GVASRRQ*VRNSVRGLVCHNSSAPPMYMGFF 

SPTVFGGGVGG* LHVTFILHPPEVEAAGIPLLL 

GPSLPQRQGREIIIWILAAPACAPFHDR*WEP 

REIRPSP*ELGLRGEPTLSYPASCRVIRQPIP*D 

RKSYSWKQRLFIINF1SFFSALAVYFRHNMYC 

EAGVYTIFAILEYTVVLTNMAFHMTAWWDF 

GNKELLITSQPEEKRF 


799 


2149 


A 


6529 


1 


874 


FFFFQRINFIEHSGSVSLLALACDLGWCEDWS 

CCLVQGGGDLVDWQTNHGEDEAGGDTDSV 

DEARCKESQQEAQENLREDLCLESFAKDKIL 

QUEGSEREHEETRTKQAALDGEPLGGGQLTA 

VHLHPSKEQQGQEGGERQRGARTHHWRGW 

EKGRRVRLRPPSGKLRADQPVRKJLGGPTPS/T 

ELPGLOPH APTPHT A/PATPTYSP APDTPNPPV 

RWKCPLPVEPRTRQLCRERTRKACPPKPRPPL 

GLPGDPTGPVTHHAPPVSPTGASGQERRAEP 

GAVSYAHASATK 


800 


2150 


A 


6544 


2 


662 


SAQRWAAVAGRWGCRLLALLLLVPGPGGAS 
EITFELPDNAKQCFYEDLAQGTKCTLEFQVTTG 
GHYDVDCRLEDPDGKVLYKEMKKQYDSFTF 
TASKNGTYKFCFSNEXFSTFTHKTVYFDFQVG 
EVTHLCFLVR/DRVSALTQMESACVSIHEALKS 
VIDYQTHFRLREAQGRSRAEDLNTRVAYWSV 
GEALILLVVSIGQVFLLKSFFSDKRTTTTRVGS 


801 


2151 


A 


6556 


1 


1319 


TPCMECIKGEGLREPQNLSGSQREPQTEGSM 

DGWRRMPRWGLLLLLWGSCTFGLPTDTTTF 

KRIFLKRMPSIRESLKERGVDMARLGPEWSQP 

MKRLTLGNTTSSVILTNYMDTQYYGEIGIGTP 

PQTFKWFDTGSSNVWVPSSKCSRLYTACVY 

HKLFDASDSSSYKHNGTELTLRYSTGTVSGFL 

SQDUTVGGrrVTQMFGEVTEMPALPFMLAEF 

DGVVGMGFIEQAIGRVTPIFDNnSQGVLKED 

VFSFYYNRDSENSQSLGGQIVLGGSDPQHYE 

GNFHYINLIKTGVWQIQMKGVSVGSSTLLCE 

DGCLALVDTGASYISGSTSSIEKLMEALGAKE 

KRLFDYWKCNEGPTLPPTFLFLLGGKDTTPLT 

SADYLFQESYSSKKJLSTLAIHAMYIPPPTXjPTL 

\ALGATF\IRKFYTEFDRGNNPHGFALAR 


802 


2152 


A 


6567 


13 


6147 


MCLGRMGASSPRSPEPVGPPAPGLPFCCGGSL 

LAVWLLALPVAWGQCNAPEW\LPFARPTNL 

TDEFEFPIGTYLNYECRPGYSGRPFSHCLKNS 

WTGAKDRCRRKSCRNPPDPVNGMVHVIKG 

IQFGSQIKYSCTKGYRLIGSSSATCIISGDTVIW 

DNETPICDRIPCGLPPTITNGDFISTNRENFHY 

GSWTYRCNPGSGGRKVFELVGEPSIYCTSND 

DQVGIWSGPAPQCIIPNKCTPFNVENGILVSD 

NRSLFSLNEWEFRCQPGFVMKGPRRVKCQA 

LNKWEPELPSCSRVCQPPPDVLHAERTQRDK 

DNFSPGQEVFYSCEPGYDLRGAASMRCTPQG 

DWSPAAPTCEVKSCDDFMGQLLNGRVLFPV 

NLQLGAKVDFVCDEGFQLKGSSASYCVLAG 

MESLWNSSVFVCEQIFCPSPPVIPNGRHTGKP 

LEVFPFGKAVNYTCDPHPDRGTSFDLIGESTIR 

CTSDPQGNGVWSSPAPRCGILGHCQAPDHFL 

FAICLKTQTNASDFPIGTSLKYECRPEYYGRPF 
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Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion)














$rTq^p>n^vwgsPgnvrypy,gryTPPnp i 

MVEM^ ■ 

LSGNAAHWSTKPP1CQRIPCGLPPTIANGDFIS 

TNRENFHYGSVVTYRCNPGSGGRKVFELVGE 

PSlYCTSNDDQVGIWSGPAPQClIPNKCrPPNV 

ENGELVSDNRSLFSLNEWEFRCQPGFVMKGP 

RRVKCQALNKWEPELPSCSRVCQPPPDVLHA 

ERTQRDKDNFSPGQEVFYSCEPGYDLRGAAS 

MRCTPQGDWSPAAPTCEVKSCDDFMGQLLN 

GRVLFPVNLQLGAKVDFVCDEGFQLKGSSAS 

YCVLAGMESLWNSSVPVCEQIFCPSPPV1PNG 

RHTGKPLEVFPFGKAVNYTCDPHPDRGTSFD 

UGESTIRCTSDPQGNGVWSSPAPRCGILGHC 

QAPDHFLFAKLKTQTNASDFP1GTSLKYECRP 

EYYGRPFS1TCLDNLVWSSPKDVCKRKSCKTP 

PDPVNGMVHVITDIQVGSRINYSCTTGHRLIG 

HSSAECILSGNTAHWSTKPPICQRIPCGLPPTI 

ANGDFISTNRENFHYGSWTYRCNLGSRGRK 

VFELVGEPSJYCTSNDDQVGIWSGPAPQaiPN 

KCTPPNVENGILVSDNRSLFSLNEWEFRCQP 

GFVMKGPRRVKCQALNKWEPELPSCSRVCQ 

PPPEILHGEHTPSHQDNFSPGQEVFYSCEPGY 

DLRGAASLHCTPQGDWSPEAPRCAVKSCDDF 

LGQLPHGRVLFPLNLQLGAKVSFVCDEGFRL 

KGSSVSHCVLVGMRSLWNNSVPVCEHIFCPN 

PPAILNGRHTGTPSGDIPYGKEISYTCDPHPDR 

GMTFNLIGESTIRCTSDPHGNGVWSSPAPRCE 

LSVRAOHCKTPEQFPFASPTIPINDFEFPVGTS 

LNYECRPG YFGKMF S IS CLENL VWS S VEDNC 

RRKSCGPPPEPFNGMVHINTDTQFGSTVNYSC 

NEGFRLIGSPSTTCLVSGNNVTWDKKAPICEn 

SCEPPPTISNGDFYSNNRTSFHNGTVVTYQCH 

TGPDGEQLFELVGERSIYCTSKDDQVGVWSS 

PPPRCISTNKCTAPEVENA1RVPGNRSFFSLTEI 

IRFRCQPGFVMVGSHTVQCQTNGRWGPKLPH 

CSRVCQPPPEILHGEHTLSHQDNFSPGQEVFY 

SCEPSYDLRGAASLHCTPQGDWSPEAPRCTV 

KSCDDFLGQLPHGRVLLPLNLQLGAKVSFVC 

DEGFRLKGRSASHCVLAGMKALWNSSVPVC 

EQIFCPNPPAILNGRHTGTPLGD1PYGKEVSYT 

CDPHPDRGMTFNL1GEST1RRTSEPHGNGVWS 

SPAPRCELPVGAACPHPPKIQNGHYIGGHVSL 

YLPGMTISYTCDPGYLLVGKGFIFCTDQGIWS 

QLDHYCKEVNCSFPLFMNGISKELBMKKVYH 

YGDYVTLKCEDGYTLEGSPWSQCQADDRWD 

PPLAKCTSRTHDALIVGTLSGTIFFILLIIFLSWI 

ILKHRKGNNAHENPKEVAIHLHSQGGSSVHP 

RTLQTNEENSRVLP 


803 


2153 


A 


6574 


2 


3233 


hgrsarlaavpaeaMpgprrpagsrlrllll 

I .LI PPr 1 1 f r PfJVQJJAnMT t\ja \nn m A-Ktrcnr 
jjijjjrrLii^UL#K\j\onAUJNl( J VAVVLrL ANTSY 

pwswa\rvgpavelalaqvkarpdllpgwt 

vrtvlgssenalgvcsdtaaplaavdlkwe 

hnpavflgpgcvyaaapvgrftahwrvpll 

tagapalgfgvkdeyalttragpsyaklgdf 

vaalhrrlgwerqalmlyayrpgdeehcff 

lveglfmrvrdrlnitvdhlefaeddlshyt 

rllrtmprkgrvtyicsspdafrtlmllalea 

glcgedyvffhldifgqslqggqgpaprrpw 

ergdgqdvsarqafqaakjjtykdpdnpeyl 

eflkqlkhlayeqfnftmedglvntipasfh 
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F=Phenylalanine, G=Glycine, H=Histidine,
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Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion)














DGLLLYIQAVTETLAHGOTVTDGBNITQRMW 

NRSFQGVTGYLKIDSSGDRETDFSLWDMDPE 

NGAFRWLNYNGTSQELVAVSGRKLNWPLG 

YPPPDIPKCGFDNEDPACNQDHLSTLEVLALV 

GSLSLLGOJVSrTIYRKMQLEKELASELWRVR 

WEDVEPSSLERHLRSAGSRLTLSGRGSNY GSL 

LTTEGQFQVFAKTAYYKGNLVAVKRVNRKR 

EELTRKVLFELKHMRDVQNEHLTRFVGACTD 

PPNICILTEYCPRGSIX^DILENESiTLDWMFRY 

SLTNDIVKGMLFLHNGATCSHGNLKSSKCVV 

DGRFVLKITDYGLESFRDLDPEQGHTVYAKK 

LWTAPELLRMASPPVRGSQAGDVYSFGIILQE 

tat w SGV FH V KCiLDLSPKElJLERVTRGEOPPFR 

PSLALQSHLEELGLLMQRCWAEDPQERPPFQ 

QIRLTLRKFNRENS SNILDNLLSRMEQ YANNL 

EELVEERTQAYLEEKRKAEALLYQILPHSVAE 

QLKRGETVQAEAFDSVTIYFSDIVGFTALSAE 

STPMQWTLLNDLYTCFDAVIDNFDVYKVET 

IGDAYMWSGLPVRNGRLHACEVARMALAL 

LDAVRSFRIRHRPQEQLRLRIGIHTGPVCAGV 

VGLKMPR YCLFGDTVNTASRME SNGEAL\KI 

HLSS\ETKAVL\EEFGGFELELRGDVEMKGKG 

KVRTYWLLGERGSSTRG 


804 


2154 


A 


6585 


2 


3837 


DAPGRPPVRLPTMELEDGWYQEEPGGSGAV 

MSERVSGLAGSIYREFERLIVRYDEEVVKELIP 

LWAVLENLDSVFAQDQEHQVELELLRDDNE 

QLITQYEREKALRKHAEEKFIEFEDSQEQEKK 

DLQTRVESLESQTRQLELKAKNYADQISILEE 

REAELKKEYNALHQRHTEMIHNYMEHLERT 

KLHQLSGSDQLESTAHSRIRXERPISLGFFPLP 

AGDGLLTPDAQKGGETPGSEQWKFQELSQPR 

SHTSLKDELSDVSQGGSKATTPASTANSDVA 

TIPTDTPLKEENEGFVKVTDAPNKSE1SKHIEV 

QVAQETRNVSTGSAENEEKSEVQAIIESTPEL 

DMDKDLSGYKGSSTPTKGIENKAFDRNTESL 

FEELSSAGSGLIGDVDEGADLLGMGREVENLI 

LENTQLLETKWALNIVKNDLIAKVDELTCEK 

DVLQGELEAVKQAKLKLEEKNRELEEELRKA 

RAEAEDARQKAKDDDDSDIPTAQRKRFTRVE 

MARVLMERNQYKERLMELQEAVRWTEM1R 

ASRENPAMQEKKRSSIWQFFSRLFSSSSNTTK 

KPEPPVNLKYNAPTSHVTPS VKKRS STLSQLP 

GDKSKAFDFLSEETEASLASRREQKREQYRQ 

VKAHVQKEDGRVQAFGWSLPQKYKQVTNG 

QGENKMKNLPVPVYLRPLDEKDTSMKLWCA 

VGVNLSGGKTRDGGSVVGASVFYKDVAGLD 

TEGSKQRSASQSSLDKLDQELKEQQKELKNQ 

EELSSLVWICTSTHSATKVLIIDAVQPGNILDS 

FTVCNSHVLCIASVPGARETDYPAGEDLSESG 

QVDKASLCXjSMTSNSSAETDSLLGGITWGC 

SAEGVTGAATSPSTNGASPVMDKPPEMEAEN 

SEVDENSTTAEE\ATEATEGNAGSAEDTV\DIS 

QTGVYTEHVFTDPLGWQ1PEDLSPVYQSSND 

SDAYKDQISVLFNEQDLVREEAQKMSSLLPT 

MWLGAQNGCLYVHSSVAQWRKCLHSKLKD 

SILSrVHVKGIVLVALADGrLAIFHRGVDGQW 

DLSNYHLLDLGRPHHSmCMTVVHDKVWCG 

YRNKIYWQPKAMKIEKSFDAHPRKESQVRQ 

LAWVGDGVWVSIRLDSTLRLYHAHTYQHLQ 

DVDIEPYVSKMLGTGKLGFSFVR1TALMVSC 






SEQ ID NO: of nucleotide sequence

SEQ ID NO: of peptide sequence

Method

SEQ ID NO: in USSN 09/496,914

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

Amino acid sequence (A=Alanine, C=Cysteine,
D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion)














ANKTSGVPGNRPGSVIRVYGDENSDKYTPGT 
FIPYCSMAHAQLCFHGHRDAVKFFVAVPGQV 
1SPQSSSSGTDLTGDKGRGHLHRSLWRRP 


805 


2155 


A 


6605 


469 


2602 


FGRLLWGTAFKSWKMKAPIPHLILLYATFTQ 

SLKWTKRGSADGCTDWSID3KKYQVLVGEP 

VRIKCALFY GYIRTNYSLAQSAGLSLMWYKS 

SGPGDFEEPIAFDGSRMSKEEDSIWFRPTLLQ 

DSGLYACVIRNSTYCMKVSISLTVGENDTGL 

CYNSKMXYFEKAELSKSKE1SCRDIEDFLLPT 

REPEILWYKECIRTKTWRPSrVFKRDTLLIREV 

REDDIGNYTCELKYGGFVVRRTTELTVTAPL 

TDKPPKLLYPMESKLTIQETQLGDSANLTCRA 

FFGYSGDVSPLIYWMKGEKFIEDLDENRVWE 

SDUKILKEHLGEQEVSISLIVDSVEEGDLGNYS 

CYYENGNGRRHASVIXHKRELMYTVELAGG 

LGAILLLLVCLVTIYKCYKIEIMLFYRNHFGA 

EELDGDNKDYDAYLSYTKVDPDQWNQETGE 

EERFALEILPDMLEKHYGYKLFIPDRDLIPTGT 

YIEDVARCVDQSKRLnVNITPNYVVRRGWSIF 

ELETRLRKMLVTGEIKV1LIECSELRGIMNYQE 

VEALKHTIKLLTVIKWHGPKCNKLNSKFWKR 

LQYEMPFKRIEPITHEQALDVSEQGPFGELQT 

VSAlSMAAA'iSTALArAHPDtRSTFHN'l^YHS 

QMRQKHYYRSYEYDVPPTGTLPLTSIGNQHT 

YCNIPMTLINGQRPQTKSSREQNPDEAHTNSA 

ILPLLPRETSISSVIW 


806 


2156 


A 


6614 


3 


1584 


NSARGGVGVRGARAMATVQEKAAALNLSAL 

HSPAHRPPGFSVAQKPFGATYVWSSIINTLQT 

QVEVKKRRHRLKRHNDCFVGSEAVDVIFSHL 

IQNKYFGDVDIPRAKVVRVCQALMDYKVFE 

AVPTKVFGKDKKPTFEDSSCSLYRFTnPNQD 

SQLGKENKLYSPARYADALFKSSDIRSASLED 

LWENLSLKPANSPHVNISTTLSPQVINEVWQE 

ETIGRLLQLVDLPLLDSLLKQQEAVPKIPQPK 

RQSTMVNSSNYLDRGILKAYSDSQEDEWLSA 

AIDCLEYLPDQMVVEISRSFPEQPDRTDLVKE 

LLFDAIGRYYSSREPLLNHLSDVHNGIAELLV 

NGKTEIALEATQLLLKLLDFQNREEFRRLLYF 

MAVAANPSEFKLQKESDNRMVVKRIFSKAIV 

DNKNLSKGKTDLLVLFLVMDHQKDVFKIPGT 

L\HKrVS\VK\LMAIQNGRDPNRDAGYIYCQRI 

DQRDYSNITEKTTIDELLYLLKTIDEDSKLSA 

ICEKKK\LLGQFYKCHPDITIEIIFGD 


807 


2157 


A 


6615 


4198 


2094 


FGIYGTTALETDELDSDRDPAIFSLCDFGAMR 

PQILUXALLTLGLAAQHQDKWCKM/VKML 

CPDRVDKKVSCQVLGLLQVPSVLPPDTETLD 

LSGNQLRSILASPLGFYTALRHLDLSTNEISFL 

QPGAFQAtTHLEHLSLAHNRLAMATALSAG 

GLGPLPRVTSLDLSGNSLYSGLLERLLGEAPS 

LHTLSLAENSLTRLTRHTFRDMPALEQLDLHS 

NVLMDIEDGAFEGLPRLTHLNLSRNSLTCISD 

FSLQQLRVLDLSCNSIEAFQTASVQPQAEFQLT 

WLDLRENKLLHFPDLAALPRLIYLNLSNNLIR 

LPTGPPQDSKGIHAPSEGWSALPLS\APSGNAS 

GRPLSQLLNLDLSYNEIELIPDSFLEHLTSLCFL 

NLSRNCLRTFEARRLGSLPCLMLLDLSHNALE 

TLELGARALG\SLRTLLLQGNALRDLPPYTFA 

NLASLQRLNLQGNRVSPCGGPDEPGP\SGCV\ 

AFSGITSLRSLSLVDNEIELLRAGAFLHTPLTE 
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LDLSSNPGLEVATGAJLGGLEASLEVLALQGN 

GLMVLQVDIJ'CFICLKRLNLAENRLSHLPAW 

TQAVSLEVLDLRNNSFSLLPGSAMGGLETSLR 

RLYLQGNPLSCCGNGWLAAQLHQGRVDVDA 

TQDLICRFSSQEEVSLSHVRPEDCEKGGLECNI 

NLIIILTFILVSAILLTTLAACCCVRRQKFNQQ 

YKA 


808


2158 


A 


6619 


153 


1852 



FKALSQYIYTOTHLEREAAFEVAI1-LRRMEEG 

ARHRNNTEKKHPGGGESDASPEAGSGGGGV 

ALKKEIGLVSACGIIVGNIIGSGIFVSPKGVLEN 

AGSVGLALTVWIVTGF1TWGALCYAELGVNI 

PKSGGDYFYVKDIFGGLAGFLRLWIAVLVrYP 

TNQAVIALTFSNYVLQPLFPTCFPPESGLKLLA 

AICLLLLTWVNCSSVRWATRVQDIFTAGKLL 

AlJVLIIIMGIVQICKGEYFWLEPKNAFEhfFQEP 

DIGLVALAFLQGSFAYGGWNFLNYWTEELV 

DP\YKNL\PRAMSIP\LVTFVYVFANV/ALYVT 

AMSPQEL\L AS\NAV AVTFGEKLLGVM A WIM 

PISVALSTFGGVNGSLFTSSRLFFAGAREGHLP 

SVLAM1HVKRCTPIPALLFTCISTLLMLVTSD 

MYTLIN Y VGF IN YLFYG VTV AGQIVLRWKKP 

DIPRPIKINLLFPIIYLLFWAFLLVFSLWSEPW 

CGIGLAIMLTGVPVYFLGVYWQHKPKCFSDFI 

ELLTLVSQKMCVVVYPEVERGSGTEEANED 

MEEQQQPMYQPTPTKDKDVAGQPQP 


809 


2159 


A 


6621 


1041 


223 


QDSRKMLPSTSVNSLVQGNGVLNSRDAARH 

TAGAKRYKYLRRLFRFRQMDFEFAAWQMLY 

LFTSPQRVYRNFHYRKQTKDQWARDDPAFL 

VLLSIWLCVST1GFGFVLDMGFFETIKLLLWV 

VLIDCVGVGLLIATLMWFISNKYLVKRQSRD 

YDVEWGYAFDVHLNAFYPLLV1LHFIQLFFIN 

HVILTDTFIGYLVGNTLWLVAVGYYTYVTFL 

GYSVGLLFFS\ALPFLKNTVILLYPFAPLEXYG 

LSLALGWNFTCITLCSFYICYRVIC 


810 


2160


A 


6623 


160 


822 


SPASGHCRLNGAAVAMFGCLVAGRLVQTAA 
QQVAEDKFVFDLPDYESINHVVVFMLGTIPFP 
EGMGG S VYFSYPD SNGMPV WQLLGFVTNGK. 
PSAIFKISGLKSGEGSQHPFGAMNIVRTPSVAQ 
IGISVELLDSMAQQTPVGNAAVSSVDSFTQFT 
QKMLDNFYNFASSFAVSQ/VPDDTQ/RPSEMF 
IPANVVLKWYENFQRRTSTEPSLLEN1IW1KJN 
F 


811 


2161 


A 


6627 


18 


3367 


LEGSLNTERAKYYLTTTIVIPHFTVTKVEDPEEG 
AAASISQEPSLADIKARIQDSDEPDLSQNS1TG 
EHSQLLDDGHKKARNAYLNNSNYEEGDEYF 
DKNIAIJEEEMDTRPKVSSLLNRMANYTNLT
QGAKEHEEAENITEGKKKPTKTPQMGTFMG 
VYLP CLQNIFG VILFLRLT WVVGTAGVLQAF 
AIVLICCCCTMLTAISMSAIATOGWPAGGSY 
FMTSRALGPEFGGAVGLCFYLGTTFAAAMYIL 
GAIEIFLVYIVPRAAIFHSDDALKESAAMLNN 
MRVYGTAFLVLMVLWFIGVRYVNKFASLFL 
ACTIVSILAIYAGAIKSSFAPPHFPVCMLGNRT 
LSSRHIDVCSKTKEINNMTVPSKLWGFFCNSS 
QFFNATCDEYFVHNNVTSIQGIPGLASGIITEN 
LWSNYLPKGEIIEKPSAKSSDVLGSLNHEYVL 
VDITTSFTLLVGIFFPSVTGIMAGSNRSGDLKD 
AQKSIPIGT1LAJLTTSFVYLSNYVLFGACIEGV 
VUU5KFGDAVKGNLVVGTLSWPSPWVTVIGS 
FFSTCGAGLQSLTGAPRLLQAIAKDNIIPFLRV 
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FGHSKANGEPTWALIXTAATAeT r nTT,TA.ST^L 

VAPILSMFKLMCYLFVNLACALQTLLRTPNW 

RPKf'RYYHWALSFMOMSlCLALMFISSWYYA 

IVAMVIAGMIYKYIEYQGAEKEWGDGIRGLS 

l^AARFALLRLEEGPPHTKNWRPQLLVLLKL 

DEDLHVKHPRLLTFASQLKAGKGLTrVGSVIV 

GNFLENYGEALAAEQTIKHLMEAEKVKGFCQ 

LWAAKLREGISHLIQSCGLGGMKHNTVVM 

GWPNGWRQSEDARAWKTFIGTVRVTTAAHL 

ALLVAKNISFFPSNVEQFSEGNIDVWWrVHDG 

GMLMLLPFLLK\QHKVWRKCSIRFF\TVAQLE 

DNSIQMKKDLATFLYHLRIEAEVEVVEMHDS 

DTSAYTYERTLMMEQRSQMT .RHMRLSKTER 

DREAQLVKDRNSMLRLTSIGSDEDEETETYQ 

EKVHMTWTKDKYMASRGQKAKSMEGFQDL 

LNMRPDQSNVRRMHTAVKLNEVIVNKSHEA 

KLVLLNMP GPPRNPEGDENYMEFLEVLTEGL 

ERVLLVRGGGSEVITIYS 


812 


2162 


A 


6628 


66 


640 


AVCTMSEMAELSELYEESSDLQMDVMPGEG 

DLPQMEVGSGSRELSLRPSRSGAQQLEEEGP 

MEEEEAQPMAAPEGKRSLANGPNAGEQPGQ 

VAGADFESEDEGEEFDDWEDDYDYPEEEQLS 

GAGYRVSAALEEADKMFLRTREPALDGGFQ 

MHYEKTPFDQLAFIEELF\SLMWNRLTEELG 

CDEIIDRE 


813 


2163 


A 


6630 


708 


1355 


AKMGAYKYIQELWRKKQSDVMRFLLRVRC 

WQ YRQL S ALHRAPRFTRPDKARRLG YKAKQ 

GY/VYIYIGFVFAVIYRIRVRRGGRKRPVPKG 

ATYGKPVHHGVNQLKFARSLQSVAEERAGR 

HCGALRVLNSYWVGEDSTYKFFEVILDPFHK 

AIRRNPDTQWITKPVHKHREMRGLTSAGRKS 

RGLGKGHKFHHTIGGSRRAAWRRRNTLQLH 

RYR 


814 


2164 


A 


6635 


201 


1705 


KGTEMNKSRWQSRRRHGRRSHQQNPWFRLR 

DSEDRSDSRAAQPAHDSGHGDDESPSTSSGT 

AGTSSVPELPGFYFDPEKKRYFRLLPGHNNCN 

PLTKESIRQKEMESKRLRIXQEEDRRKKIARM 

GFNASSMLRKSQLGFLNVTNYCHLAHELRLS 

CMERKKVQIRSMDPSALASDRFNLILADTNS 

DRLFTVNDVTVGGSKYG1INLQSLKTPTLKVF 

MHENLYFTNRKV\NSVCWASLNHLDSHtLLC 

LMGLAETPGCATLLPASLFVNSHPAGG)RPG\ 

MLCSFRIPGAWSCAWSLNIQANNCFSTGLSR 

RVLLTNWTGHRQSFGTNSDVLAQQFALMA 

PLLFNGCRSGEEFAIDLRCGNQGKGWKATRLF 

HDSAVTSVRILQDEQYLMASDMAGKIKLWD 

LRTTKC\ r RQYEGHVNEYAYLPLHVHEEEGIL 

VAVGQDCYTRIWSLHDARLLRTIPSPYPASKA 

DIPSVAFSSRLGGSRGAPGLLMAVGQDLYCY 

OIU 


815 


2165 


A 


6643 


659 


3282 


NKNILEVPSARTTRIMGDHLDLLLGWLMAG 

PVFGIPSCSFDGRIAFYRFCNLTQVPQVLNTTE 

RLLLSFNYIRTVTASSFPFLEQLQLLELGSQYT 

PLTIDKEAFRNLPNLRILDLGSSKIYFLHPDAF 

QGLFHLFELRLYFCGLSDAVLKDGYFRNLKA 

LTRLDLSKNQIRSLYLHPSFGKLNSLKSIDFSS 

NQffLVCEHELEPLQGKTLSFFSLAANSLYSR 

VSVDWGKCMNPFRNMVLEILDVSGNGWTV 

DITGNFSNAISKSQAFSL1LAHHIMGAGFGFHN 

IKDPDQNITAGLARSSVRHLDLSHGFVFSLNS 
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RVraTLKDLKVLNI^YNKJNKtA£)fiAFYGLD 

NLQVLNLSYNLLGELYSSNFYGLPKVAY1DL 

QKNHIAUQI^TFKFLEKLQTLDLRDNALTTIH 

FIPSIPDIFLSGNKLVTLPK1NLTANL1HLSENR 

LENLDILYFLLRVPHLQILILNQNRFSSCSGDQ 

TPSENPSLEQLFLGENMLQLAWETELCWDVF 

EGLSHLQVLYLNHNYLNSLPPGVFSHLTALR 

(H^SLNSNRLTVLSHNDLPANLEILDISRNQLL 

APNPDVFVSLSVLDITHNKF1CECELSTFINWL 

NHTNVTIAGPPADrx'Crvrm)SLSGVSLFSLSTE 

GCDEEE\OJCSLKFSLnVCTVTLTLFLMTILTV 

TKFRGFCFICYKTAQRLVFKDHPQGTEPDMY 

KYDAYLCFSSKDFTWVQNALLKHLDTQYSD 

QNRFNLCFEERDFVPGENRPVANIQDAIWNSR 

KIVCLVSRHFLRDGWCLEAFSYAQGRCLSDL 

NSALIMVWGSLSQYQLMKHQSIRGFVQKQQ 

YLRWPEDLQDVGWFLHKLSQQILKKEKEKK 

KDNNIPLQTVATIS 


816


2166 


A 


6646 


1 


3811 


RDRACjVRPAGKQHAAAAFYDVGGDRPWDS 

GNTQLPPRNPVKANAMFGAGDEDDTDFLSPS 

GGARLASLFGLDQAAAGHGNEFFQYTAPKQP 

KKGQGTAATGNQATPKTAPATMSTPTILVAT 

AVHAYRYTNGQYVKQGKFGAAVLGNHTTR 

EYRILLYISQQQPVI-VARIHVNFELMVRPNNY 

STFYDDQRQNWSIMFESEKAAVEFNKQVCIA 

KCNSTSSLDAVLSQDLIVADGPAVEVGDSLE 

VAYTGWLFQNHVLGQVFDSTANKDKLLRLK 

LGSGKVIKGWEDGMLGMKKGGKRLL1VPPA 

CAYGSEGVIGWTQATDSILVFEVEVRRVK1A 

KDSGSDGHSVSSRDSAAPSPIPGADNLSADPV 

VSPPTSIPFKSGEPALRTKSNSLSEQLAINTSPD 

AVKAKIJSRMAKMGQPMLPILPPQLDSNDSEi 

EDVNTLQGGGQPWTPSVQPSLQPAHPALPQ 

MTSQAPQPSVTGLQAPSAALMQVSSLDSHSA 

VSGNAQSFQPYAGMQAYAYPQASAVTSQLQ 

PVRPLYPAPLSQPPHFQGSGDMASFLMTEAR 

QHNTEIRMAVSKVADKMDHLMTKVEELQKH 

SAGNSMLIPSMSVTMETSMIMSNIQRIIQENER 

LKQE1LEKSNRIEEQNDKISEL1ERNQRYVEQS 

NLMMEKRNNSLQTATENTQARVLHAEQEKA 

KVTEELAAATAQVSHLQLKMTAHQKKETEL 

QMQLTESLKETDLLRGQLTKVQAKLSELQET 

SEQAQSECFKSEKQNRKQLELKVTSLEEELTDL 

RVEKESLEKNLSERKKKSAQERSQAEEEIDEI 

RJCSYQEELDKLRQLLKKTRVSTDQAAAEQLS 

LVQAELQTQWEAKCEHLLASAKDEHLQQYQ 

EVCAQRDAYQQKLVQLQEKSVCFAVCLALQA 

QITALTKQNEQHIKELEKNKSQMSGVEAAAS 

DPSEKVICKIMNQVFQSLRREFELEESYNGRTI 

LGTIMNTKMVTLQLLNQQEQEKEESSSEEEE 

EKAEERPRRPSQEQSASASSGQPQAPLNRERP 

ESPMVPSEQWEEAVPLPPQALTTSQDGHRR 

KGDSEAEALSEIKDGSLPPELSC1PSHRVLGPP 

TSIPPEPLGPVSMDSECEESLAASPMAAKVPDN 

PSGKWCVREVAPDGPLQESSTRLSLTSVDPEE 

GDPLALGPESPGEPQPPQLKKDDVTSSTGPHK 

ELSSTEAGSTVAGAALRPSHHSQRSSLSGDEE 

DELFKGATLKALRPKAQPEEEDEDEVSMKGR 

PPPTPLFGDDDDDDDEDWLG 


817 


2167 


A 


6649 


63 


1673 


FFRSSSDWGSPIRQYE/HSTPAHOGPVMGLteG 
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KS^AKNSQLEIVLVGKTGAGKSATGNSILGRK 

vrasGTA^siTKkcE^m^ 

TPGIFDTEVPNAETSKEURCILLTSPGPHALLL 

VWLGRYTEEEHKATEKILKMFGERARSFMTL 

IFTRKDDLGDTNLHDYLREAPEDIQDLMDIFG 

DRYCALNNKATGAEQEAQRAQLLGLIQRW 

RENKEGCYTNRMYQRAEEEIQKQTQAMQEL 

HRVELEREKARIREEYEEKIRKLEDKVEQEKR 

KKQMEKKLAEQEAHYAVRQQRARTEVESKD 

GILELIMTALQIASFILLRLFAED 


818 


2168 


A 


6660 


357 


1890 


APSGSWTRVVLTLDPCSLRSRSPRSLLDPGMP 

GISARGLSHEGRKQLAVNLTRVLALYRSILDA 

YIIEFFMDNIWDTLPCSWQEALDGLKPPQLA 

TMLLGMPGEGEWRYRSVWPLTLLALKSTA 

CALAFTRMPGFQTPSEFLENPSQSSRLTAPFR 

KHVRPKKQHEIRRLGELVKKLSDFT/GLHPGC 

RRGLRPG\HLSRFMALGLGLMVKSIEGDQRL 

VERAQRLDQELLQALEKEEKRNPQVVQTSPR 

HSPHHVVRWVDPTALCEELLLPLENPCQGRA 

RLLLTGLHACG\DLSVALLRHFSCCPEWALA 

SVGCCYMKLSDPGGYPLSQWVAGLPGYELP 

YRLREGACHALEEYAERLQKAGPGLRTHCY 

RAALETORRARPELRRPGVQGIPRVHELKDBE 

YVQRGLQRVGLDPQLPLNLAALQAHLAQEN 

RWAFFSLALLLAPLVETLILLDRLLYLQEQA 

LSP\GFHAELLPIFSPELSPRNLVLVATKMPLG 

QALSVLETEDS 


819 


2169 


A 


6661 


65 


2686 


SGSGHCLAEAASMGPWGWKLRWTVALLLA 

AAGTAVGDRCERNEFQCQDGKCrSYKWVCD 

GSAECQDGSDESQETCLSVTCKSGDFSCGGR 

VNRCIPQFWRCDGQVDCDNGSDEQGCPPKTC 

SQDEFRCHDGKCISRQFVCDSDRDCLDGSDE 

ASCPVLTCGPASFQCNSSTCIPQLWACDNDPD 

CEDGSDEWPQRCRGLYVFQGDSSPCSAFEFH 

CLSGECIHSSWRCDGGPDCKDKSDEENCAVA 

TCRPDEFQCSDGNCIHGSRQCDREYDCKDMS 

DEVGCVNVTLCEGPNKFKCHSGECITLDKVC 

NMARDCRDWSDEPDCECGTNECLDNNGGCS 

IIVCNDLKIGYECLCPDGFQLVAQRRCEDIDE 

CQDPDTCSQLCVNLEGOYKCQCEEGFQLDPH 

TKACKAVGSIAYLFFTNRHEVRKMTLDRSEY 

TSUPNLRNVVALDTEVASNRIYWSDLSQRMI 

CSTQLDRAHGVSSYDTVISRDIQAPDGLAVD 

WIHSNIYWTDSVLGTVSVADTTCGVKRKTLFR 

ENGSKPRAIWDPVHGFMYWTDWGTPAKIK 

KGGLNGVDIYSLVTEN1QWPNGITLDLLSGRL 

YWVDSKLHSISSEDVNGGNRKTILEDEOJLAH 

PFSLAVFEDKVFWTDHNEAIFSANRLTGSDV 

NLLAENLLSPEDMYLFHNLTQPRGVNWCERT 

TLSNGGCQYLCLPAPQINPHSPKFTCACPDGM 

LLAR\DMRSCLTEG\EAAVATQETSTVRLKVS 

STAVRTQHTTTRPVPDTSRLPGATPGLTTVEI 

VTMSHQALGDVAGlRGrAEKKPSSVRALSIVL 

PIV\LLVFLCLGVFLLWKNWRLKNINSINFDNP 

VYQKTTEDEVHICHNQDGYSYPSRQMVSLED 

DVA 


820 


2170 


A 


6666 


17 


4146 


ERGISSQIKGMKSGSGGGSPTSLWGLLFLSAA 
LSLWPTSGEICGPGIDIRNDYQQLKRLENCTVI 
EGYLHDLLISKAEDYRSYRFPKLTVITEYLLLF 
RVAGLESLGDLFPNLTVIRGWKLFYNYALVIF 
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EMTNLKD1 GLYNLRNITRGXAIRIEKNADLCYL 

STVDWSLILDAVSNNYIVGNKPPKECGDLCP 

GTMEEKJ'MCEKTTINNEYNYRCWTTNRCQK 

MCPSTCGKRACTENNECCHPECLGSCSAPDN 

DTACYACRHYYYAGVCVPACPPNTYRFEGW 

RCVDRDFCAN1LSAESSDSEGFVIHDGECMQB 

CPSGFIRNGSQSMYCIPCEGPCPKVCEEEKKT 

KaiDSVTSAQMLQGCTlFlCGNLLlNIRRGNNIA 

SELENFMGLIEWTGYVKIRHSHALVSLSFLK 

NLRLILGEEQLEGNYSFYVLDNQNLQQLWD 

WDHRNLTIKAGKMYFAFNPKLCVSEIYRMEE 

\TTGTKGRQSKGDnmNNGERASCESDVLHF 

TSTTTSKNRinTWHRYRPPDYRDLISFTVTVK 

EAPFKNVTEYDGQDACGSNSWNMVDVDLPP 

NKDVEPG1LLHGLKPWTQYAVYVKAVTLTM 

VENDHIRGAKSE1LY1RTNASVPSIPLDVLSAS 

NSSSQLIVKWNPPSLPNGNLSYYIVRWQRQP 

QDGYLYRHNYCSKDKEPIRKYADGTIDIEEVT 

ENPKTEVCGGEKGPCCACPKTEAEKQAEKEE 

AEYRKVFENFLHNSIFVPRPERKRRDVMQVA 

NTTMSSRSRNTTAADTYNJTDPEELETEYPFF 

ESRVDNKERTVISNLRPFTLYRIDIHSCNHEAE 

KLGCSASNFVFARTMPAEGADDIPGPVTWEP 

RPENSIFLKWPEPENPNGLILMYEIKYGSQVE 

DQRECV SRQEYRKYGG AKLNRLNPGNYTARI 

QATSLSGNGSWTDPVFFYVQAKRYENFIHLII 

ALPVAVLLIVGGLVIMLYVFHRKRrWSRLGN 

GVLYASVNPEYFSAADVYVPDEWEVAREK1T 

MSRELGQGSFGMVYEGVAKGWKDEPETRV 

AIKTVNEAASMRERIEFLNEASVMKEFNCHH 

VVRLLGWSQGQPTLVIMELMTRGDLKSYLR 

SLRPEMENNPVLAPPSLSKMIQMAGEIADGM 

AYLNANKFVHRDLAARNCMVAEDFTVKIGD 

FGMTRDIYETDYYRKGGKGLLPVRWMSPESL 

KDGVFTTYSDVWSFGWLWEIATLAEQPYQ 

GLSNEQVLRFVXMEGGLLDKPDNCPDMLFEL 

MRMCWQYNPKMRPSFLEIISSIKEEMEPGFRE 

VSFYYSEENKLPEPEELDLEPENMESVPLDPS 

ASSSSLPLPDRHSGHKAENGPGPGVLVLRASF 

DERQPYAHMNGGRKNERALPLPQSSTC \ 


821 


2171 


A 


6691 


106 


825 


GRVLFRGCGVGHKGQVLMGTFILAQDWLSE 

SNHVFCVSSMLRLQKRLASSVLRCGKKKVW 

LDPNETNEIANANSRQQBRKLIKDGLIIRKPVT 

VHSRARCRKNTLARRKGRHMGIGKRKGTAN 

ARMPEKVTWMRRMRDJUUXRRYRES/KRYR 

ESKJKIDRHN1YHSLYUCVKGNVFKNKRILMEH 

IHKLKADKARKKLLADQAEARRSKTKEARK 

RREERLQAKKEEIIKTLSKEEETKK 


822 


2172 


A 


6715 


772 


21 


DFRPGLLLPRKKKMFGFHKPKMYRSIEGC\CI 

SGAKSSSS\RFTDSKRYEK^J)FQ\SCFGLHETR\ 

SGDI\CNA\CVLL\LKRWKBCLPAGSKK\NWNH 

VVDARAGPSVLKTTLKPKKVKTL\SGNRIK\ST 

Q1SKLQKEFKR\HNSDAHSTTS\SASP\AQSPLF 

TVNQFRWTGSDTGVGFPGSNRNHFVFSFLDU 

TYWKRQKICCGIXIYKGRFGEVLIDTHLFKPCC 

SNKKA\AAEKPEE QGPEPLPI STQE WVTEVFM 


823 


2173 


A 


6727 


3 


4063 


PYLATLQLX)SSLLrPPKYQTPPAAAQGQATPG 
NAGPLAPNGSAAPPAGSAFNPTSNSSSTNPAA 
SSSASGSSVPPVSSSASAPGISQISTTSSSGFSGS 
VGGQNPSTGGISADRTQGNIGCGGDTDPGQS 
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TPVTLNSAAPASNSGAGVLPSPATPRFSYFTP 

RTPRTPRTPRGGGTASGQGSVKYDSTDQGSP 

ASTPSTTRPLNSVEPATMQPIPEAHSLYVTLIL 

SDSVMrnFKDROTDSCQCACNMNIKGADVG 

LYIPDSSNEDQYRCTCGFSA1MNRKLGYNSGL 

FLBDELDIFGKNSDIGQAAERRLM\MCQSTFL 

PQVEGTKKPQEPPISLLLLIQNQHTQPFASLN 

FLDYISSNNRQTLPCVSWSYDRVQADKNDY 

WTECFNALEQGRQYVDNPTGGKVDEALVRS 

ATVHSWPHSNVLDISMLSSQDVVRMLLSLQP 

FLQDAIQKKRTGRTWENIQHVQGPLTWQQFH 

KMAGRGTY GSEESPEPLPIPTLLVGYDKDFLT 

ISPFSLPFWERLLLDPYGGHRDVAYTVVCPEN 

EALLEGAKTFFRDLSAVYEMCRLGQHKPICK 

VLRDGIMRVGKTVAQKLTDELVSEWFNQPW 

SGEENDNHSRLKLYAQVCRHHLAPYLATLQL 

DSSLL1PPKYQTPPAAAQGQATPGNAGPLAPN 

GSAAPPAGSAFNPTSNSSSTNPAASSSASGSSV 

PPVSSSASAPGISQISTTSSSGFSGSVGGQNPST 

GGISADRTQGNIGCGGDTDPGQSSSQPSQDG 

QESVTERERJGIPTEPDSADSHAHPPAVVIYM 

VDPFTYAAEEDSTSGNFWLLSLMRCYTEMLD 

NLPEHMRNSFILQIVPCQY MLQTMKDEQ VF Y 

IQYLKSMAFSVYCQCRRPLPTQIHiKSLTGFGP 

MSIEMTLKNPERPSPIQLYSPPFILAPKDKQT 

ELGETFGEASQKYNVLFVGYCLSHDQRWLL 

ASCTDLHGELLETCVVNIALPNRSRRSKVSAR 

KIGLQKLWEWCIGIVQMTSLPWRWIGRLGR 

LGHGELKDWSILLGECSLQTTSKKLKDVCRM 

CGISAADSPSILSACLVAMEPQGSFWMPDAV 

TMGSVFGRSTALNMQSSQLNTPQDASCTHIL 

VFPTSSTIQVAPANYPNEDGFSPNNDDMFVDL 

PFPDDMDNDIGILMTGNLHSSPNS SP VPSPGSP 

SGIGVGSHFQHSRSQGERLLSREAPEELKQQP 

LALGYFVSTAKAENLPQWFWSSCPQAQN\QC 

PLFLKASLHHMSVAQTDELLPARNSQRVPHP 

LDSKTTSDVLRFVLEQYNALSWLTCNPATQD 

RTSCLPVIIFWLTQLYNAIMNIL 


824 


2174 


A 


6732 


2440 


365 


VEEGLGRRRTPPGGRRGPVTPARPGPDSVRR 

RLLPPSSAAAFSSHRHNLLCSRRRGGGGGGG 

GGGGGTDCRPGITGPTAATSPSGEPGNAASAP 

LSLLSPFPGQTTYQHPGVAEPSAYGGRDVAC 

ASLVFGRJLQHRG GDRKRGLLGRSSGDAASD 

QPFRCRSGSTAGRLVKQMDFTEAYADTCSTV 

GLAAREGNVKVLRKLLKKGRSVDVADNRG 

WMPIHEAAYHNSVECLQMLINADSSENYIKM 

KTFEGFCALHLAASQGHWKIVQILLEAGADP 

NATTLEETTPLFLAVENGQIDVLRLLLQHGAN 

VNGSHSMCGWNSLIIQASFQENAEIIKLLLRK 

GANKECQDDFG1TPLFVAAQYG\KLESL\SILIS 

SG\ANVNCQALDKATPLFIAAQEGHTKCVELL 

LSSGADPDLYCNEDSWQLPIHAAAQMGHTKJ 

LDLLIPLTNRACDTGLNKVSPVYSAVFGGHE 

DCLEILLRNGYSPDAQACLVFGFSSPVCMAFQ 

KDCEFFGIVNILLKYGAQINELHLAYCLKYEK 

FSIFRYELRKGCSLGPWNHIYEFVNHAIKAQA 

KYKEWLPHLLVAGFDPLILLCNSWIDSVSIDT 

LIFTLEFXNWKTLAPAVERMLSARASNAWIL 

QQHUTVPSLTHLCRLEIRSSLKSERLRSDSYIS 
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QLPLPRSLHNYLLYEDVLRMYEVPELAA1QD 
G 


825 


2175 


A 


6735 


277 


1252 


RIMGLFDRGVQMLLTTVGAFAAFSLMTIAVG 

TDYWLYSRGVCKTKSVSENETSKKNEEVMT 

HSGLWRTCCLEGNFKGLCKQIDHFPEDADYE 

ADTAEYFLRAVRASSIFPILSVILLFMGGLCIA 

ASEFYKTRHNIlLSAGIFFVSAGLSNIIGnVYIS 

ANAGDPSKSDSKKNS YS YG WSF YFGALSF11A 

EMVGVLAVHMFIDRHKQLRATARA\TDYLQ 

ASATnUPSYRYRYQRRSRSSSRSTEPSHSRDA 

SPVGIKGFNTLPSTEISMYTLSRDPLKAATTPT 

ATYNSDRDNSFLQVHNCIQKENKDSLHSNTA 

NRRTTPV 


826 


2176 


A 


6744 


3 


5177 


SDDLRTGLFQDVQDAESLKLPGVYEVLFYNE 

TEDCPGMMLWRYPEPRGLTLVRTTPVPFNTT 

EDPDISTADLGDVLQDPCSLEYWDELQKVFV 

AFREFNLSESKVCELQLPDINLVNDQKKLVSS 

DLWRIVT.NSSQNGADDQSvSASESGSQSTCDPL 

VTPTALAACTRVDSCFTPWFVPSLCVSFQFAH 

LEFHLCHHLDQLGTAAPQYLQPFVSDRNMPS 

ELEYMIVSFREPHMYLRQWNNGSVCQEIQFL 

AQADCKLLECRNVTMQSWKPFSIFGQMAVS 

SDVVEKLLDCTVTVDSVFVNLGQHVVHSLNT 

AIQAWQQNKCPEVEELVFSHFVICNDTQETL 

RFGQVDTDENILLASLHSHQYSWRSHKSPQL 

LHICIEGWGNWRWSEPFSVDHAGTFIRTIQYR 

GRTASLIIKVQQLNGVQKQIIICGRQIICSYLSQ 

SDELKVVQHYIGQDGQAVVREHFDCLTAKQK 

LPSYILENNELTELCVKAKGDEDWSRDVCLE 

SKAPEYSIVIQVPSSNSSirYVWCTVLTLEPNS 

QVQQRMTVFSPLFIMRSHLPDPIIIHLEKRSLGL 

SETQOPGKGQEKPLQN1EPDLVHHLTFQAREE 

YDPSDCAVPISTSLIKQIATKVHPGGTVNQILD 

EFYGPEKSLQPIWPYNKKDSDRNEQLSQWDS 

PMRVKLS1WKPYVRTLLIELLPWALLINESKW 

DLWLFEGEK1VLQVPAGKLIPPNFQEAFQIGIY 

WANTNTVHKSVAIKLVHNLTSPKWKDGGNG 

EWTLDEEAFVDTEIRLGAFPGHQKLCQFCIS 

SMVQQGIQIIQIEDKTniNNTPYQIFYKPQLSV 

CNPHSGKEYFRVPDSATFSICPGGEQPAMKSS 

SLPCWDLMPDISQSVLDASLLQKQIMLGFSPA 

PGADSSQCWSLPAIVRPEFPRQSVAVPLGNFR 

ENGFCTRAIVLTYQEHLGVTYLTLSEDPSPRV 

IIHNRCPVKMLIKENIKD1FKFEVYCKK1PSECS 

IHHELYHQIS SYPDCKTKDLLPSLLLRVEPLDE 

VTTEWSDAIDINSQGTQVVFLTGFGYVYVDV 

VHQCGTVFrrVAPEGKAGPILTNTORAPEKIV 

TF/KMFITQLSLAVFDDLTHHKASAELLRLTL 

DNIFLCVAPGAGPLPGEEPVAALFELYCVEIC 

CGDLQLDNQLYNKSNFHFAVLVCQGEKAEPI 

QCSKMQSLLISNKELEEYKEKCFIKLCITLNEG 

KSILCDINEFSFELKPARLYVEDTFVYYIKTLF 

DTYLPNSRLAGHSTHLSGGKQVLPMQVTQH 

ARALVNPVKLRKLVIQPVNLLVSIHASLKLYI 

ASDHTPLSFSVFERGPIFrrARQLVHALAMHY 

AAGALFRAGWWGSLDILGSPASLVRSIGNG 

VADFFRLPYEGLTRGPGAFVSGVSRGTTSFVK 

HISKGTLTSITNLATSLARNMDRLSLDEEHYN 

RQEEWRRQLPESLGEGLRQGLSRLG1SLLGAI 

AGIVDQPMQNFQKTSEAQASAGHKAKGVISG 
VOKQMQVFTKPIQ(IAA^ySQTGYG!LHGA

GLSQLPKQimQPSD\VHAlX}A™^ 

KMLQSLGRPEVHMALDWLVRGSGQEHEGC 

LLLTSEVLFWSVSEDTQQQAFPVTEIDCAQD 

SKQNNLLTVQLKQPRVACDVEVDGVRERLSE

QQYNRLVDYITKTSCHLAPSCSSMQIPCPWA 

AEPPPSTVKTYHYLVDPHFAQVFLSKFTMVK 

NKALRKGFP 


827 


2177 


A 


6748 


2 


1662 


FVGAPRRGNPFGSPGNPGRHQGPCHRPRGTK 

ASGVSPTLWRPQAAATGLEMPSSGRALLDSP 

LDSGSLTSLDSSVFCSEGEGEPLALGDCFTVN 

VGGSRFVLSQQALSCFPHTRLGKLAVWASY 

RRPGALAAVPSPLELCDDANPVDNEYFFDRS 

SQAFRYVLHYYRTGRLHVMEQLCALSFLQEI 

QYWGIDELSIDSCCRDRYFRRKELSETLDFKK 

DTEDQESQHESEQDFSQGPCPTVRQKLWNEL 

EKPGSSTAARIFGVISIIFVGVSIINMALMSAEL 

SWLDLQLLEILEYVCISWTTGEFVLRFLCVRD 

RCRFLRKVPNHDLLAILPFYITLLVESLSG\SQT 

TQEUENVGAHCPG(XRLLRAL\RMLKAWGR 

HSTGLRSLGMTITQCYEEVGLLLLFLSVGISIF 

SrVEYFAEQSIPDTTFTSVPCAWWWATTSMT 

TVGYGD1RPDTTTGKIVAFMC1LSG1LVLALPI 

AIINDRFSACYFTLKLKEAAVRQREALKKLTK 

NIATDSYISVNLRDVYARSMEMLRLKGRER 

ASTRSSGGDDFWF 


828 


2178 


A 


6786 


5672 


1360 


GTHPASSGPVPLPPAAVSAATREELGEPVPFV 

TASSGFQSMHSSNPKVRSSPSGNTQSSPKSKQ 

EVMVRPPTVMSPSGNPQLDSKFSNQGKQGGS 

ASQSQPSPCDSKSGGHTPKALPGPGOSMGLK 

NGAGNGAKGKGKRERSISADSFDQRDPGTPN 

DDSDIKECNSADHIKSQDSQHTPHSMTPSNAT 

APRSSTPPHGQTTATEPTPAQKTPAKWYVFS 

TEMANKAAEAVLKGQVEHVSFHIQNISNNK 

TERSTAPLNTQISALRNDPKPLPQQPPAPANQ 

DQNS SQNTRLQPTPPIPAPAPKPAAPPRPLDRE 

SPGVENKLIPSVGSPASSTPLPPDGTGPNSTPN 

NRAVTYVSQGSNSSSADPKAPPPPPVSSOEPPT 

LGENPDGLSQEQLEHRERSLQTLRDIQRMLFP 

DEKEFTGAQSGGPQQNPGVLDGPQKKPEGPI 

QAMMAQSQSLGKGPGPRTDVGAPFGPQGHR 

DVPFSPDEMVPPSMNSQSGTIGPDHLDHMTP 

EQIAWLKLQQEFYEEKRRKPEQWVQQCSLQ 

DMMVHQHGPRGVVRGPPPPYQMTPSEGWAP 

GGTEPFSDGINMPHSLPPRGMAPHPNMPGSQ 

MRLPGFAGMINSEMEGPNVPNPASRPGLSGV 

SWPDDVPKIPDGRNFPPGQGIFSGPGRGERFP 

NPQGLSEEMFQQQLAEKQLGLPPGMAMEGIR 

PSMEMNRMIPGSQRHMEPGNNPIFPRIPVEGP 

l^rhtfSjur rJvulPPQMGPGRELEFGMVPSGM 

KGDVNLNVNMGSNSQMIPQKMREAGAGPEE 

MLKLRPGGSDMLPAQQKMVPLPFGEHPQQE 

YGMGPRPFLPMSQGPGSNSGLRNLREPIGPDQ 

RTNSRLSHMPPLPLNPSSNFTSLNTAPPVQRG 

LGRKPLDISVAGSQVHSPGINPLKSPTMHQVQ 

SPMLGSPSGNLKSPQTPSQLAGMLAGPAAAA 

SIKSPPVLGSAAASPVHLKSPSLPAPSPGWTSS 

PEPPLQSPGIPPNHKAPLTMASPAMLGNVESG 

GPPPPTASQPASVNIPG\SLPSSTPYTMPPEPTL 

SQNPLSIM\MSR\MSKFAM\PS\SNPGYNHDAI 
KTVASSDDDSPPARSPNLPSMNNMFGMGINT

QNPRISGPNPWPMPTLSPMGMTQPLSHSNQ 

MPSPNAVGPN1PPHGVPMGPGLMSHNPIMGH 

GSQEPPMVPQGRMGFPQGFPPVQSPPQQVFFP 

HNGPSGGQGSFPGGMGFPGEGPLGRPSNLPQ 

SSADAALCKPGGPGGPDSFTVLGNSMPSVFT 

DPDLQEVIRPGATGIPEFDLSRIIPSEKPSQTLQ 

YFPRGEVPGRKQPQGPGPGFSHMQGMMGEQ 

APRMGLALPGMGGPGPVGTPDIPLGTAPSMP 

GHNPMRPPAFLQQGMMGPHHRMMSPAQST 

MPGQPTLMSNPAAAVGMDPGKDRGPAGLYT 

HPGPVG SPGMMMSMQGMMGPVNRTS 


829 


2179 


A 


6797 


433 


3 


ASFFNFSICICKIILE VGFP VGHPAHDD VGGRH 

GPGGR/GSRSPRSLQCAPGGGRRSGCPAGSSP 

ASTCPPSPGGSGADRFGPSPPPPSREAAPTAG 

AAASSTSSGASCPPVPASSRWGVRSRTRSGSG 

GEREPRDRPSERPRLV 


830 


2180 


A 


6800 


3 


1911 


LPEKAFGPRTPRAPRRRRRRUXSPPPRFPPPL 

DRJEPRAPGPWLCPSRAGTAQDPARIRERRGR 

VAGGAAGPAMELRARGWWLLCAAAALVAC 

ARGDPASKSRSCGEVRQIYGAKGFSSS\DVPQ 

AEISGEHLIUCPQGYTCCTSEMEENLANRSHA 

ELETALRD S SRVLQ AML ATQLRSFDDHFQHL 

LNDSERTLQATFPGAFGELYTQNARAFRDLY 

SELRLYYRGANLHLEETXAEFWARLLERLFK 

QLHPQLLLPDDYLDCLGKQAEALRPF\GEAP\ 

RELRLRAmA\FVAAR\SFVQGLGVAS\DWR 

KVAQVPLXj\PEC\SRAVIEAGSYC/ALHCVGVP 

GARPCPDYCRNVLKGCLANQADLDAEWRKL 

LDSMVLITDKFWGTSGVESVIGSVHTWLAEA 

rNALQDNRDTLTAKVIQGCGNPKVNPQGPGP 

EEKRRRGKLAPRERPPSGTLEKLVSEAKAQL 

RDVQDFV\aSLPGTLCSEKMALSTASDDRCWN 

GMARGRYLPEVMGDGLANQINNPEVEVDIT 

KPDMTIRQQIMQIJCIMTNRLRSAYNGNDVDF 

QDASDDGSGSGSGDGCLDDLCGRKVSRKSSS 

SRTPLTHALPGLSEQEGQKTSAASCPQPPTFL 

LPLLLFLALTVARPRWR 


831 


2181 


A 


6808 


2 


1522 


ASRHGMTPGALLMLLGALGPPLAPGVRGSEA 

EGRLREKLFSGYDSSVRPAREVGDRVRVSVG 

LILAQLISLNEKDEEMSTKVYLDLEWTDYRLS 

WDPAEHDGIDSLRTTAESVWLPDVVLLNNND 

GNFDVALDISVWSSDGSVRWQPPGIYRSSCS 

IQVTYFPFDWQNCTMVFSSYSYDSSEVSLQT 

GLGPDGQGHQEIHIHEGTF1ENGQWENIHKPS 

RLIQPPGDPRGGREGQRQEVIFYLIIRRKPLFY 

LVNVIAPCELITLLAIFVFYLPPDAGEKMGLSIF 

ALLTLTVFLLLLADKVPETSLSVPUIKYLMFT 

MVLVTFSVILSVVVLNLHHRSPHTHQMPLWV 

RQ1FIIIKLPLYLRLKRPKPERDLMPEPPHCSSP 

GSGWGRGTDEYFBRKPPSDFLFPKPNRFQPEL 

SAPDLRRFIDGPNRAVALLPELREVVSSISYIA 

RQLQEQEDHDALKEDWQFVAMWDRLFLW 

TFIIFTSVGTLWIFLDATYHLPPPDPFP 


832 


2182 


A 


6824 


71 


1079 


ETMAKNPPENCEDCHILNAEAFKSKKICKSLK 
ICGLVFGILALTLIVLFWGSKHFWPEVPKKAY 
DMEHTFYSNGEKKKIYMEIDPVTRTEIFRSGN 
GTDETLEVHDFKNGYTGIYFVGLQKCFIKTQI 
KV1PEFSEPEEEIDENEE1TTTFFEQ S VIWVPAE 
KPIENRDFLKNSKILEICDNVTMYW\INPTL\IS 
Amino acid sequence (A=Alanine, C=Cysteine,
D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion)














u i r AM^LttfllNl* ArULVoixODrEEEGEDLHFP 
ANEKKGffiQNEQWWPQVKVEKTRHARQAS 
EEELPIND YTENGIKt'DPMLDERG YCCI YCRR 
GNRYCRRVCEPLLGYYPYPYCYQGGRVICRV 
IMPCNWWVARMLGRV 


833 


2183 


A 


6846 


116 


602 


EAEGEQVCGAKCCGDAPHVENREEETARIGP 

GVMESKEERALNNLIVENVNQENDEKDEKE 

QVANKGEPLALPLNVSEYCVPRGNRRRFRVR 

QPILQYRWDIMHRLGEPQARMREENMERIGE 

EVRQLMEKLREKQLSHSLRAVSTDPPHHDHH 

DEFC\LMP 


834 


2184 


A 


6851 


3 


2024 


PNGVA1XHLPGAAVIPNTNYMFQDALGGRSR 

GSREESPAPSRAPASASLWRRLWVEAKMAA 

HAAAAAQAAAAQAAHAEAADSWYLALLGF 

AEHFRTSSPPKIRLCVHCLQAVFPFKPPQRIEA 

RTHLQLGSVLYHHTKNSEQARSHLEKAWLIS 

QQIPQFEDYKFEAASLLSELYCQENSVDAAKP 

LLRKAIQISQQTPYWHCRLLFQT . AQLHTLEKD 

LVSACDUGVGAEYARWGSEYTRALFLLSK 

GMLLLMERKLQEVHPLLTLCGQIVEmVQGN 

PIQKESLRVFFLVLQVTHYLDAGQVKSVKPC 

LKQLQQCIQHSTLHDDEILPSNPADLFHWLP 

KEHMCVL\TLVTVMHSMQAGYLEKAQKYT 

DKALMQLEKLKMLDCSPILSSFQVELLEHI1M 

CRLVTGHKATALQEISQVCQLCQQSPRLFSN 

HAAQLHTLLGLYCVSVNCMDNAEAQFTTAL 

RLTNHQELWAFIVTNLASVYIREGNRHQEVV\ 

LYSLLERINPDHSFPVSSHCLRAAAFYVRGLF 

SFFQGRY'NEAKRFLRETLKMSNAEDLNRLTA 

CSLVLLGHIFYVLGNHRESNNMVVPAMQLAS 

KJDPDMSVQL WSS ALLRDLNKACGNAMDAHE 

AAQMHQNFSQQLLQDHIEACSLPEHNLHWT 

i^rFrv^ryAQNGPNTSLASLL 


835 


2185 


A 


6855 


334 


1268 


PTRRPILPLTSPKAISVPSPLQGKQHTLVKSCL 

SVSGIGGFLVSLSSRMKLQTLAVSVTALKFWS 

AYVPCQTQDRDALRLTLEQIDLIRRMCASYSE 

LZLv I oAKAJLNUI ^KJL^CLIGVEGGHSLDNS 

LSILRTFYMLGVRYLTLTHTCNTPWAESSAK 

GVHSFVmSGLTDFGEKWAEMNRLGMMV 

DLSHVSDAVARRAiEVSQAPVIFSHSAARGV 

CNSARNVPDDILQLLEEERWAFVMVSLFHGE 

LIQWQPIRPMCSTVADHFDHIKAWGSKFIGI 

GGDYDGAGKYRKKTTCKAPWRTSSRMSS 


836 


2186 


A 


6862 


315 


ii 


PPRSRPSCWRKKVGPGRPWWWGGTGPPGQG 
RPEIRLLPLPMTGACGAVAASRTGSSGPG/SSL 
PNGHGGKGSGLANGLAGNrAGHLGLGSSFGT 
GPGSGRPPP 


837 


2187 


A 


6863 


2 


1615 


VLRGQRGPAGGLAEERRRGRNEWRIHDVTT 

APFPGLVQRRSRLLIVSQVRYFLKNKVSPDLC 

NEDGLTALHQCCIDNFEEIYKLLLSHGANVN 

AKDNELWTPLHAAATCGHINLVKILVQYGA 

DLLAVNSDGNMPYDLCEDEPTLDVIETCMAY 

QGH'QEKINEMRVAPEQQMIADIHCMIAAGQ 

DLDWIDAQGATLLHIAGANGYLRAAELLLDH 

GVRVDVKDWDGWEPLHAAAFWGQMQMAE 

LLVSHGAN\LNARTSMDEMPIDLCEEEEFKVL 

LLELK\HKHDVIMKSQLRHKSSLSRRTSHRQA 

S/SVGKWRRTQPVGTGPNL\YRKEYE/GEEAI 

LWQRSA\AEDQRTSTYNGDIRET\RTDQENKD 
nucleotide location corresponding to last amino acid residue of peptide sequence
















P^PWL^K^VLLSElFPTKIPRGELDMPVENGLR 

APVSAYQYALANGDVWKVHEVPDYSMAYG 

NPGVADATPPWSSYKEQSPQTLLELKRQRAA 

AKLLSHPFLSTHLGSSMARTOESSSEGKAPLI 

GGRTSPYS SNGTSVYYTVTSGDPPLLKFKAPI 

EEMEEKVHGCCRIS 


838 


2188 


A 


6865 


6291 


739 


AGPLEPRVQGAMALQLWALTLLGLLGAGAS 

LRPRKLDFFRSEKELNHLAVDEASGVVYLGA 

VNALYQLDAKLQLEQQVATGPVLDNKKCTP 

PIEASQCHEAEMTDNVNQLLLVDPPRKRLVE 

CGQLLKGI\CALRALSNISLRLFYEDGSGEKSF 

VASNDEGVATVGLVSSTGPGGDRVLFVGKG 

NGPHDNGIIVSTRLLDRTDSREAFEAYTDHAT 

YKAGYLSTNTQQFVAAFEDGPYVFFVFNQQD 

KliPARNRTLLARMCREDPNYYSYLEMDLQC 

RDPDfflAAAFGTCLAASVAAPGSGRVXYAVF 

SRDSRSSGGPGAGLCLFPLDEVHAKMEANRN 

ACYTGTREARD1FYKPFHGDIQCGGHAPGSSK 

SFPCGSEHLPYPLGSRDGLRGTAVLQRGGLN 

LTAVTVAAENNHTVAFLGTSDGRILKVYLTP 

DGTSSEYDSILVErNKRVKRDLVLSGDLGSLY 

AMTQDKVFRLPVQECLSYPTCTQCRDSQDPY 

CGWCWEGRCTRKAECPRAEEASHWLWSRS 

KSCVAVTSAQPQNMSRRAQGEVQLTVSPLPA 

LSEEDELLCLFGESPPHPARVEGEAVICMSPSS 

IPVTPPGQDHVAVTIQLLLRRGNIFLTSYQYPF 

YDCRQAMSLEENLPCISCVSNRWTCQWDLR 

YHECREASPNPEDGIVRAHMEDSCPQFLGPSP 

LV1PMNHETDVNFQGKNLDTVKGSSLHVGSD 

LLICFMEPYTMQESGTFAFRTPKLSHDANETL 

P1JHQLYVKSYGK24IDSKLHVTLYDCSFGRSDC 

SLCRAANPDYRCAWCGGQSRCVYEALCNTT 

SECPPPVTTRIQPETGPLGGOIRrnLGSNLGVQ 

AGDIQRISVAGRNCSFQPERYSVSTRIVCVIEA 

AETPFTGGVEVDVFGKLGRSPPNVQFTFQQP 

KPLSVEPQQGPQAGGTTLTIHGTHLDTGSQED 

VRVTLNGVPCKVTKFGAQLQCVTGPQATRG 

QMLLEVSYGGSPVPNPGIFFTYRENPVLRAFE 

PLRSFASGGRSmVTOQGFSLIQRFAMWIAEP 

LQSWQPPREAESLQPMTVVGTDYVFHNDTK 

WFLSPAVPEEPEAYNLTVL1EMDGHRALLRT 

EAGAFEYVPDPTFENFTGGVKKQVNKLIRAR 

GTNLNKAMTLQEAEAFVGAERCTMKTLTET 

DLYCEPPEVQPPPKRRQKRDTTHNLPEF1VKF 

GSREWVLGRVEYDTRVSDVPLSLILPLVIVPM 

VWIAVSVY CY WRKSQQAEREYEKIKSQLEG 

LEESVRDRCKKEFTDLMIEMEDQTNDVHEAG 

IPVLDYKTYTDRVFFLPSKDGDKDVMITGKL 

DIPEPRRPWEQALYQFSNLLNSKSFLINFIHT 

L\ENQPEFSARAKVYFASLLTVALHGKLEYYT 

DIMtlTLFLELLEQYVVAKNPKLMLRRSETVV 

ERMLSNWMSICLYQYLKDSAGEPLYKLFKAI 

KHQVEKGPVDAVQKKAKYTLNDTGLLGDD 

VEYAPLTVSVIVQDEGWAIPVKVLNCDnSQ 

VKEK1IDQVYRGQPCSCWPRPDSWLEWRPG 

STAQILSDLDLTSQREGRWKRVNTLMHYNVR 

DGATLILSKVGVSQQPEDSQQDLPGERHALL 

EEENRVWHLVRPTDEVDEGKSKRGSVKEKE 

RTKAITEIYLTRIXSVKGTLQQFVDNFFQSVL 

APGHAVPPAVKYFFDFLDEQAEKHNIQDEDTI 














fflWKTNSIJLRFWVNILKNPHFIFDVHVHEVV 

DASLSX^QTFMDACTRTEHKLSRDSPSNKLL 

YAKE1STYKKMVEDY YKG1RQMVQ VSDQDM 

NTHLAEISRAHTDSLKTLVALHQLYQYTQKY 

YDEIINALEEDPAAQKMQLAFRLQQIAAALE 

NKVTDL 


839 


2189 


A 


6872 




1485 


RARRLALQCHVCVCALTPGEQSGRRLPGQT 

WLMFSCPCFSLQDNSl« , SSlTV r rECDEDPVSLH 

EDQTDC S SLRDENNKENYPDAGALVEEHAPP 

SWEPQQQNVEATVLVDSVLRPSMGNFKSRKP 

KSIFKAESGRSHGESQETEHWSSQSECQVRA 

GTPAHESPQNNAFKCQET\VRL\QPRIDQRTAT 

SPKD AFETR\QDT -NEEEAAQVHGVKDPAPAS 

TQSVLA\DGTDSADPSPVHKDGQNEADSAPE 

DLHSVGTSRLUVYHITDGDNPTAVRHGCSL/F 

SGQSQRFNLDPESAPSPPSTQQFMMPRSSSRC 

SCGDGKEPQTITQLTKHIQSLKRKIRKFEEKFE 

QEKKYRPSHGDKTSNPEVLKWMNDLAKGRK 

QLKELKLKLSEEQGSAPKGPPRNLLCEQPTVP 

RENGKPEAAGPEPSSSGEETPDAALTCLKERJl 

EQLPPQEDSKVTKQDKNLIKPLYDRYRIIKQIL 

STPSLffTTVSQDTCMLLLCTDV 


840 


2190 


A 


6873 


2 



2054 


FFRFYFSFIRLFAMSLADLTKTNIDEHFFGVAL 

ENNRRSAACKR5PGTGDFSRNSNASNKSVDY 

SRSQCSCGSLSSQYDYSEDFLCDCSEKAINRN 

YLKQPWKEKEKKKYNVSKISQSKGQKEISV 

EKKHTWNASLFNSQIHMIAQRRDAMAHRDLS 

ARLHKIKGLKNELADMHHKLEA1LTENQFLK 

QLQLRHLKAIGKYENSQNNtPQIMAKHQNEV 

KNLRQLLRKSQEKERTL SRKLRETDSQLLKT 

KDELQALQKLSEDKNLAEREELTHKLSnTTK 

MDANDKKIQSLEKQLRLNCRAFSRQLAIETR 

KTLAAQTATKTLQVEVKHLQQKLKEKDREL 

EIKNIYSHRILKNLHOTEDYPKVSSTKSVQAD 

RXILPFTSMRHQGTQKSDVPPI/TTKGKKATG 

NJDHKEKSTEINHEIPHCVNKLPKQEDSKRKY 

EDLSGEEKHLEVQILLENTGRQKDKKEDQEK 

KNIFVKEEQELPPKIIEVIHPERESNQEDVLVR 

EKFKRSMQRNGVDDIUGKGTAPYTKGPLRQ 

RRHYSFTEATENLHHGLPASGGPANAGNMR 

YSHSTGKHLSNREEMELEHS\DSGYEPSFGKS 

SRIK VKDTTFRDKKS SLMEELFGSG YVLKTD 

QSSPGVAKGSEEPLQSKESHPLPPSQASTSHA 

FGDSKVTWNSIBCPSSPTEGKRKHI 


841 


2191 


A 


6874 


3 


2867 


ssrtremeekeilrrOIrjLLQgliddyktlhg 
napapgtpaa sg wqpptyhs graf s aryprp 
srrgysshhgpswrkkyslvnrppgpsdppa 
dhavrplhgarggqppvpqqhvlerqvqls 

QGQNVVIKVKPPSKSGSASASGAQRGSLEEFE 

DTPWSDQRPREGEGEPPRGQLQPSRPTRARG 

TCSVEDPLLVCQKEPGKPRMVKSVGSVGDSP 

REPRRTVSESVIAVKASFPSSALPPRTGVALG 

RKLGSHSVASCAPQLLGDRRVDAGHTDQPVP 

SGSVGGPARPASGPRQAREASLVVTCRTNKF 

RKNNYKWVAASSKSPRVARRALSPRVAAEN 

VCKASAGMANKVEKPQLIADPEPKPRKPATS 

SKPGSAPSKYKWKASSPSASSSSSFRWQSEAG 

SKDHASQLSPVLSRSPSGD\RPALAHSGLKPLS 

GETPLSAYKVKTRTKIIRRRGSTSLPGDKKSG 

TSPAATAKSHLSl.RRRQALRGKSSPVLKKTFN 














kglVOVtkhrlcrlppsrahlptkeasslha 
vrtaptskviktryrivkktpasplsappfpls 
lpswrarrlslsrslvlnrlrpvasgggkaq 

PGSPWWRSKGYRCIGGVLYKVSANKXSKTSG 
QPSDAGSRPLLRTGRLDPAGSCSRSLASRAVQ 
RSLAIIRQARQRREKRKEYCMYYNRFGRCNR 
GERCPYIHDPEKVAVCTRFVRGTCKKTDGTC 
PFSHHVSKEKMPVCSYFLKGICSNSNCPYSHV 
YVSRKAEVCSDFLKGYCPLGAKCKKKHTLLC 
PDFARRGACFRGAQCQLLHRTQKRHSRRAAT 
SPAPGPSDATARSRVSASHGPRKPSASQRPTR 
QTPSSAALTAAAVAAPPHCPGGSASPSSSKAS 
SSSSSSSSPPASLDHEVAPSLQEAALAAACSNR 
LCKLPSFISLQSSPSPGAQPRVRAPRAPLTKDS 
GKPLHIKPRL 


842 


2192 


A 


6898 


506 


2071 


WPDLVHTWSSEEAMGSCCSCPDKDTVPDNH 

RNKFKV1NVDDDGNELGSGIMELTDTELILYT 

RKRDSVKWHYLCLRRYGYD SNLFSFESGRRC 

QTGQGIFAFKCARAEELFNMLQEIMQNNSIN 

WEEPVVERNNHQTELEVPRTPRTPTTPGFAA 

QNLPNGYPRYPSPGDASSHPSSRHPSVGSARL 

PSVGEESTHPLLVAEEQVHTYVNTTGVQEER 

KNRTSVHVPLEARVSNAESSTPKEEPSSIEDR 

DPQ1LLEPEGVKFVLGPTPVQKQLMEKEKLE 

QLGRDQVSGSGANNTEWDTGYDSDERRDAP 

SVNKLVYENINGLSIPSASGVRRGRLTSTSTSD 

TQNINNSAQRRTALLNYENLPSLPPVWEARK 

LSRDEDDNLGPKTPSLNGYHNNLDPMHNYV 

NTENVTVPASAHKIEYSRRRDCTPTVFNFDIR 

RPSLEHRQLNYIQVDLEGGSDSDNPQTPKTPT 

TPLPQTPTRRTELYAVIDIERTAAMSNLQKAL 

PRDDGTSRNKTRHNSTVDLPL 


843 


2193 


A 


6919 


2 


663 


AGRPGTTHASGKMAYQSLRLEYLQIPPVSRA 
YTTACVLTTAAVQLELITPFQLYFNPELIFKHF 
QIWRLITNFLFFGPVGFNFLFNMIFLYRYCRM 
LEEG SFRGRTADF VFMFLFGGFLMTLFGLF VS 
L/VFLGPGLYNN/GSSMCX}AE\EPLCPHELLRP 
SQLPGPLSALGAHGIFLWGELNHCGPFGYCS 
WTfflFFLGRCISQSTWWNKNSENTIYFESYF 


844 


2194 


A 


6928 


902 


366 


HRLCMPIQGACGERME/FSLLLPGLECNGVIL 
AHCNLRLPGSSNSPASASQVAGITGVCHHAR 
LIFVFSVETGFLHAGQAGLELLTSGDPPASAS 
QSAGITGKSQHTRPGYEFEPYSAAQEDALKA 
LM 


845 


2195 


A 


6939 


1660 


317 


LYPENLGESLFPILLLPPPWPDGGRPCCVEMS 

TRAKKLRRIWRJLEEKESVAGAVQTLLLRSQE 

GGV\TSAAASTLSEPPRRTQESRTRTRALGLPT 

LPMEKLAASTEPQGPRPVLGRESVQVPDDQD 

FRSFRSECEAEVOWNLTYSRAGVSVWVQAV 

EMDRTLHKIKCRMECCDVPAETLYDVLHD1E 

YRKKWDSNVIETFDIARLTVNADVGYYSVVR 

CPKPLKNTOVITLRSWLPMGADYIIMNYSVK 

HPKYPPRKDL VRAVSIQTG YLIQ STGPKSC V1T 

YLAQVDPKGSLPKWVVNKSSQFLAPKAMKK 

MYKACLKYPEWKQKHL\PHFKPWL\HPEQSP 

LPSLALS\ELSVQHADS\LENIDESAV\AESREE 

RVMGGAGGEGXSDDDTSLYAEAPHRFRETETG 

PGAGRALGAAAAPALSPLHPPGTWWHRARP 

RRVLQPGWTEPQ 


846


2196 


A


6944


42 


2672 


RRKMAOCROSICCCCPWCCCCGERETRTPE
ELTILGETQEEEDEILPI^\iEDYDRC^ 
YLEVLETMDNKKGRRYEAVKWMVVFAIGV 
CTGLVGLFVDFFVRLFTQLKFGWQTSVEECS 
QKGCLALSLLELLGFNLTFVFLESLLGUEPVE 
AGSGITEGKCYLYARQVPGLVRLPTLLWKAL 
GVLLTVAAMLU\GLGSPMIHSGSWGAGLPQ 
EQSISLRKIQFNFPYFRSDRYGK\DKRDFVSAG 
AAAGVAAAFGAPIGGTLFSLEBGSSFWNQGL 
TWKVLFCSMSATFTLNFFRSGIQFGSWGSFQL 
PGLLNFGEFKCSDSDKKCHLWTAMDLGFFV 
VMGVIGGLLGATFNCLNKRLAKYRMRNVHP 
KPKLVRVLESLLVSLVriVVVFVASMVLGEC 
RQMSSSSQIGNDSFQLQVTEDVNSSIKTFFCP 
NDTYNDMATLFFNPQESAILQLFHQDGTFSPV 
TLALFFVLYFLLACWTYGISVPSGLFVPSLLC 
GAAFGRLVANVLKSYIGLGHIYSGTFALIGAA 
AFLGGWRMTTSLTVTLIES'nNErrYGLPIMVT 
LMVGKWTGDFFNKGIWDIHVGLRGVPLJLEW 
ETEVEMDKLRASDIMEPNLTYVYPHTR1QSLV 
SILRTTVHHAFPWTENRGNEKEFMKGNQLIS 
NNIKFKKS SILTRAGEQRKRSQSMKS YPS SEL 
RNMCDEHUSEEPAEKEDLLQQMLERRYTPY 
PNLYPDQSPSEDWTMEERFRPLTFHGLILRSQ 
LVTLLVRGVCYSESQSSASQPRLSYAEMAED
YPRYPDIHDLDLTLLNPRMIVDVTPYMNPSPF 
TVSPNTHVSQVFNLFRTMGLRHLPVVNAVGE 
IVGHTRHNLTYEFLQARLRQHYOTI 


847 


2197 


A 


6951 


3 


1994 


NTNSSSVTNSAAGVEDLNIVQVTVPDNEKER 

LSSIEKIKQLREQVNDLFSRKFGEAIGVDFPVK 

VPYRKITFNPGCWIDGMPPGVVFKAPGYLEI 

SSMRRILEAAEFIKFTV1RPLPGLELSNGEYST 

VGKRKIDQEGRVFQEKWERAYFFVEVQNIST 

CLICK^SMSVSKEYNLRRHYQTNHSKHYDQY 

MERMRDE1CLHELKKGLRKYLLGLSDTECPE 

QKQVFANPSPTQKSPVQPVEDLAGNLWEKLR 

EKJRSFVAYSIAIDEITDINNTTQLAIFIRGVDE 

NFDVSEELLDTVPMTGTKSGNEIFSRVEKSLK 

NFCINWSKLVSVASTGTPPMVDANNGLVTia 

KSRVATFCKGAELKSrCClIHPESLCAQ\KLKM 

DHVMDVWKSVNWICSRGLNHSEFTTLLYEL 

DSQYGSLLYYTEIKWLSRGLVLKRFFESLEEI 

DSFMSSRGKPLPQLSSID^TRDLAFLVDMTM 

HLNALNISLQGHSQIVTQMYDLIRAFLAKLCL 

WETHLTRNNLAHFPTLKLVSRNESDGLNYIP 

KIAELKTEFQKRLSDFKLYESELTLFSSPFSTKI 

DSVHEELQMEYIDLQCNTVLKTKYDKVG1PE 

FYKYLWGSYPKYKHHCAKILSMFGSTYICEQ 

LFSIMKLSKTKYCSQLKDSQWDSVLHIAT 


848 


2198 


A 


6985 


3 


2*9 


SVQYLPGRPTRTHASTDAPLMLKFTPLTSKTK 
ASAPVQCLLLMAATFSPQGLAKPHSGTIPIT\C 
CFNAINTK1PIQRLESYTR1TNIQCPKEAVM 


849 


2199 


A 


6999 


963 


5 


LDFLCHRDMGDNITSITEFLLLGFPVGPRIQM 

LLFGLFSLFYVFTLLGNGHLGLISLDSRLHAP 

MYFFLSHL\AVVDIAYACNTVPRMLVNLLIIP 

AKPISFAGRMMQTFLFSTFAVTECLLLWMS 

YDLYVWCHPLRYLAIMTWRVCITLAVTSWT 

TGV1XSUHLVLLLPLPFCRPQKIYHFFCEILA 

VLKIAC^THINENMVLAGAISGLVGPLSTIV 

VSYMCILCAILQIQSREVQRKAFCTCFSHLCVI 
GLFYGTAIIMYVGPRYGNPKEQKKYL1XFHS
LFNPMLNPLICSLRNSEVKNTLKRVLGVERAL 


850 


2200 


A 


7001 


1 


1011 


MGNDSVSYEYGDYSDLSDRPVDCLDGACLAl 

DPLRVAPLPLYAAIFLVGVPGNAMVAWVAG 

KVARRRVGATWLLHLAVADLLCCLSLPILAV 

PIARGGHWPYGAVGCRALPSIILLTMYASVLL 

LAALSADLCFLALGPAWCLRFS/GACGVQVA 

CGAAWTLALLLTVPSAIYRRLHQEHFPARLQ 

CWDYGGSSSTENAVTAIRFLFGFLGPLVAVA 

SCHSALLCWAARRCRPLGTAIWGFFVCWAP 

YHLLGLVLTVAAPNSALLARALRAEPLIVGL 

AIAHSCLNPMLFLYFGRAQLRRSLPAACHW 

ALRESQGQDESVDSKiCS 15HDL VSEME V 


851 


2201 


A 


7011


1 


2310 


AAASPLl^RKGlPlRAfeVCADCSAlPDPGWASI 

SRGYLVCDECCSVHRSLGRHIS1VKHLRHSA 

WPPTLLQMVHTLASNGANSIWEHSLLDPAQV 

QSGPALKQTPKDKVNHPIKSEFIRAKYQMLAF 

VHKLPCRDDDGVTAKDLSKQLHSSVRTGNLE 

TCLRLL SLGAQANFFHPEKGTTPLHV A4KAG 

QTLQAELLWYGADPGSPDVNGRTPIDYARQ 

AGHHELAERLVECQYELTDRLAFYLCGRKPD 

HKNGHYIIPQMADSLDLSELAKAAKKKLQAL 

SNRLFEELAMDVYDEVDRRENDAVWLATQN 

HSTLVTERSAVPFLPVNPEYSATRNQGRQKL 

ARFNAREFATLIIDILSEAKRRQQGKSLSSPTD 

NLELSLRSQSDLDDQHDYDSVASDEDTDQEP 

LRSTGATRSNRARSMDSSDLSDGAVTLQEYL 

ELKKALATSE AKVQQLMKVNS S LSD ELRRLQ 

REIHKLQAENLQLRQPPGPVPTPPLPSERAEH 

TPMAPGGSTHRRDRQAFSMYEPGSALKPFGG 

PPGDELTTRLQPFHSTELEDDAIYSVHVPAGL 

YRIRKGVSASAVPFTPSSPIXSCSQEGSRHTSK 

LSRHGSGADSDYENTQSGDPLLGLEGKRFLE 

LGKEEDFHPELESLDGDLDPGLPSTEDVILKT 

EQVTKNIQELLRAAQEFKHDSFYPCSEKIHLA 

VTEMASLFPKRPALEPVRSSLRLLNASAYRLQ 

SECRKTVPPEPGAPVDFQLLTQQVIQCAYDIA 

KAAKQLVTTITREKKQ 


852 


2202 


A 


7016 


484 


1777 


RISKIQVYYSTGYSSRKMNPTLGLAIFLAVLL 

TVKGLLKPSFSPRNYKALSEVQGWKQRMAA 

KELARQNMDLGFKLLKKLAFYNPGRNIFLSP 

I^ISTAFSMLCLGAQDSTLDEIKQGFNFRKMP 

EKDLHEGFHYIIHELTQKTQDLKLSIGNTLFID 

QRLQPQRKFLEDAKNFYSAETILTNFQNLEM 

AQKQINDFI/ESKTHGKJNNL1ENIDPGTVMLL 

ANYIFFRARWKHEFDPNVTKEEDFFLEKNSS 

VKVPMMFRSGIYQVGYDDKLSCTILEIPYQK 

NITAIFILPDEGKLKHLEKGLQVDTFSRWKTL 

LSRRVVDVSVPRLHMTGTFDLKKTLSYIGVS 

KIFEEHGDLTKIAPHRSLKVGEAVNKAELKM 

DERGTEGAAGTGAQTLFMETPLVVKIDKPYL 

LLIYSEKIPSVLFLGK1VNPIGK 


853 


2203 


A 


7017 


1 


3293 


" "MTHACNPSTLGGQGRRITRSHGRRRSSRGPV 
ARHVAAGAGHENKHGGSRRFPAGVAPRRAM 
ANVSKKVSWSGRDRDDEEAAPLLRRTARPG 
GGTPLLNGAGPGAARQSPRSALFRVGHMSSV 
ELDDELLEPuOMDPPHPFPKEIPHNEKLLSLKY 
ESLDYDNSENQLFLEEERRINHTAFRTVEIO. 
WVICALIGILTGLVACFIDIWENLAGLKYRVI 
KGSILPNIDKFTEKGGLSFSLLLWATLNAAFV 
WRLKTLVIKVSGVILSVVGGLAVGKEGPMI

HSGSVIAAGISQGRSTSLKRDFK1FEYFKRDTE 

KRDFVSAGAAAGVSAAFGAPVGGVLFSLEEG 

ASFWNQFLTWIUFFASMISTFTLNFVLSIYHG 

NMWDLSSPGLINFGRFDSEKMAYTIHEIPVFI 

AMGVVGGVLGAVFNALNYWLTMFRIRYIHR 

PCLQVIEAVLVAAVTATVAFVLIYSSRDCQPL 

QGGSMSYPLQLFCADGEYNSMAAAFFNTPEK 

SWSLFHDPPGSYNPLTLGLFTLVYFFLACWT 

YGLTVSAGVFIPSLLIGAAWGRLFGISLSYLTG 

AAIWADPGKYALMGAAAQLGGIVRMTLSLT 

VIMMEATSNVTYGFPIMI.VI.MTAKIVGDVFIE 

GLYDMHIQLQSVPFLHWEAPVTSHSLTAREV 

MSTPVTCLRRREKVGVIVDVLSDTASNHKGF 

PWEHADDTQPARLQGLDLRSQUVLLKHKVF 

VERSNLGLVQRRLRLKDFRDAYPRFPPIQSIH 

VSQDERECTMDLSEFMNPSPYTVPQEASLPR 

VFKLFRALGLRHLVWDNRNQWGLVTRKD 

LARYRLGKRGLEEtSLAQTGPKAQATAEGRV 

AGAAQQPCQLRAVTLEDLGLLLAGGLASPEP 

LSLEELSERYESSHPTSTASVPEQDTAKHWNQ 

LEQWWELQAEVACLREHKQRCERATRSLL 

REUQVRARVQLQGSELRQLQQEARPAAQAP 

EKEAPEFSGLQNQMQALDKRLVEVREALTRL 

RRRQVQQEAERRGAEQEAGLRLAKLTDLLQ 

QEEQGREVACGALQKNQEDSSRRVDLEVAR 

M 


854 


2204 


A 


7037 


139 


2604 


-AGT-WEPRPYDQAKETGAPGSQP-PV^PMELRP - 
WLLWVVAATGTLVLLAADAQGQKVFTNTW 
AVRIPGGPAVANS VARKHGFLNLGQIFGDYY 
HFWHRGVTKRSLSPHRPRHSRLQREPQVQWL 
EQQVAmTKRDVYQEPTDPKFPQQWYUSG 
VTQ\RDLMVKAAWAQGYTGHG1WSILDDGI 
EKNHPDLAGNYDPGASFDVNDQDPDPQPRY 
TQMNDNRHGTRCAGEVMVANNGVCGVGY 
AYNARIGGVRMLDGEVTDAVEARSLGLNPN 
HIHIYSASWGPEDDGKTVDGPARLAEEAFFR 
GVSQGRGGLGSIFVWASGNGGREHDSCNCD 
GYTNSIYTLSISSATQFGNVPWYSEACSSTLA 
TTYSSGNQNEKQIVTTDLRQKCTESHTGTSAS 
APLAAGI[ALTLEANKNLTWRDMQHLWQTS 
KPAHLNANDWATNGVGRKVSHSYGYGLLD 
AGAMVALAQNWTTVAPQRKCIIDILTEPKDI 
GKRLEVRKTVTACLGEPNHTTRLEHAQARLT 
LSYNRRGDLAIHLVSPMGTRSTLLAARPHDY 
SADGFNDWAFMTTHSWDEDPSGEWVLEIEN 
TSEANNYGTLTKFTLVLYGTAPEGLPVPPESS 
GCKTLTSSQACWCEEGFSLHQKSCVQHCPP 

CATCQGPALTDCLSCPSHASLDPVEQTCSRQS 
QSSRESPPQQQPPRLPPEVEAGQRLRAGLLPS 
HLPEWAGLSCAFIVLVFVTVFLVLQLRSGFS 
FRGVKVYTMDRGLISYKGLPPEAWQEECPSD 
SEEDFXjRGERTAFIKDQSAL 


855 


2205 


A 


7058 


3 


1441 


QRPASQLLAPFAAEALPGAPRAAMAQHFSLA 
ACDWGFDLDHTLCRYNLPESAPLIYNSFAQF 
LVKEKGYDKELLNVTPEDWDFCCKGLALDL 
EDGNFLKL ANN GTVLRASHGTKMMTPE VL A 
EAYGKKEWKHFl^SDTGMACRSGKYYFYDN 














YFDLPGmeARV^YLTkLN^GQKTFDFW 

KDIVAAIQHNYKMSAFKENCGIYFPEIKRDPG 

RYLHSRPESVKKWLRQLKNAGKILLLITSSHS 

DYCRLLCA\YLLGNDFTDLFDIVITNALKPGFF 

SHLPSQRPFRTLENDEEQEALPSLDKPGWYSQ 

GNAVHLYELLKKMTGKPEPKWYFGDSMHS 

DIFPARHYSNWETVLILEELRGDEGTRSQRPE 

ESEPLEKKGKYEGPKAKPLNTSSKKWGSFFU 

DSVLGLENTEDSLVYTWSCKRISTYSTIAIPSI 

EAIAELPLDYKFTRFSSSNSKTAGYYPNPPLV 

LSSDETLISK 


856 


2206 


A 


70$2 


396 


1635 


SSPSVFfiPEHAVOPVFTMEFLKTCVLRRNACT 

AVCFWRSKWQKPSVRRISTTSPRSTVMPAW 

VIDKYGKNEVLRFTQNMMMPI IHYPNE VI VK 

VHAASVNPIDVNMRSGYGATALNMKRDPLH 

VKIKGEEFPLTLGRDVSGVVMECGLDVKYFK 

PGDEVWAAVPFWKQGTLSEFWVSGNEVSH 

KPKSLTHTQAASLPYVALTAWSAINKVGGLN 

DKNCTGKRVLILGASGGV GTFAIQVMKAWD 

AHVTAVCSQDASELVRKLGADDVTDYKSGSV 

EEQLKSLKPFDFILDNVGGSTETWAPDFLKK 

WSGATYVTLVTPFLLNMDRLGIADGMLQTG 

VTVGSKALKHFWKGVHYRWAFFMASGPCL 

DDIAELVDAGKIRPVUEQTFPFSKVPEAFLKV 

ERGHARGKTVINVV 


857 


2207 


A 


7088 


320 


2417 


LRRRKMTPQSLLQTTLFLLSLLFLVQGAHGR 

GHREDFRFCSQRNQTHRSSLHYKPTPDLRISIE 

NSEEALTVHAPFPAAHPASRSFPDPRGLYHFC 

LYWNRIIAGRLHLLYGKRDFLLSDKASSLLC^" 

QHQEESLAQGPPLLATSVTSWWSPQNISLPSA 

ASFTFSFHSPPHTGAHNASVDMCELKRDLQL 

LSQFLKHPQKASRRPSAAPASQQLQSLESKLT 

SVRFMGDMGSFEEDRINATYWKLQPTAGLQ 

DLHIHSRQEEEQSEIMEYSVLLPRTLFQRTKG 

RSGEAEKRLLLVDFSSQALFQDKNSSQVLGE 

KVLGIWQNTKVANLTEPWLTFQHQLQPKN 

VTLQCVFWVEDPTLSSPGHWSSAGCETVRRE 

TQTSCFCNHLTYFAVLMVSSVEVDAVHKHY 

LSLLSYVGCWSALACLVTIAAYLCSRVPLPC 

RRKPRDYTIKVHMNLLLAVFLLDTSFLLSEPV 

ALTGSEAGCRASAIFLHFSLLTCLSWMGLEG 

YNLYRLWEVFGTYVPGYLLKLSAMGWGFPI 

FLVTL V AL VD VDN YGPIIL AVHRTPEG VIYP S 

MCWIRDSLVSYITNLGLFSLVFLFNMAMLAT 

MVVQILRLRPHTQKWSHVLTLLCLSLVLG\LP 

WALIFFSFASGTFQLWLYLFSHTSFQGFLIFI 

WYWSMRLQARGGPSPLKSNSDSARLPISSGS 

TSSSRI 


858 


2208 


A 


7091 


185 


415 


DAGAVKSSDTNI WFRGMCDDKKGHRCPS* G 

QPQHFHVAFHTEAEGAMFYFRLHVIHRVMQS 

QQQLFPSTLFSWLLE 


859 


2209 


A 


7136 


3 


302 


FFFWRQSLALLPRLECSGATGAHCNLHFPGSS 
DCPTS AS* IAGITGACYTIA WLLFVFLAETGFH 
HVGQGGLELLTSSDPSGSASQSAGITGVSHCT 
WPI 


860 


2210 


A 


7156 


23 


591 


AI^TETRTFDMRRLLLVTSLVVVLLWEAGAV 
PAPKVPIKMQVKHWPSEQDPEKAWGARVVE 
PPEKDOQLVVLFPVQKPKLLTTEEKPRGQGR 
GPILPGTKAWMETEDTLGRVLSPEPDHDSLY 
HpPEEE^QEEPJ?PJUWVMPNHQYliGPEBDQ
DHIYHPQ* GSRGHHCPRPVPRPRLLGLGPSLP 
CPS


861


2211 


A 


7161 


1220 


1003 


NYVCTIAF*EKKMGF*LSLSCLVLLFVLFLDCI 
LTTTTRIMFHCTYLFASVCLSLLNTLLSPNCL 
KSAMILQ


862 


2212 


A 


7211 


665 


847 


LKYYHITMG1YKTGKKVIL*KSSMSNRFSVIF 
YKKIQKLSFSNYVYHQNYVFSSDWSYDF 


863 


2213 


A 


7212 


924 


1273 


HGSSCALGDLAPG*LPSGPVLSSPAVRL*RKP 
LVWDSPSCLPATGPT*GLVLVLGGPDCPWA 
RGQHEHKRMRAP* SCRVTVNL AKKKKKTOQ 
COCPNYQSPPKECDYNILANSVA 


864 


2214 


A 


7214 


845 


\i\9 


SDKGGKKADRKNHLRHAFPLLPHRVRERLH 
DPKVPVDADHVQGQDPGRAAHDIHGEDVTE 
KVSKDPLAPDEVGDTDEGHDRHGHREVGQR 
HGHDQEEVAYEERACEGGKFATVEVTDKPV 
DEALREAMPKVAKYAGGTNDKGIGMGMTV 
PISFAVFPNEDGSLQKKLKVWFRIPNQFQSDP 
PAPSDKSVKJEEREGITVYSMQFGGYAKEAD 
YVAQATRLRAALEGTATYRGDIYFCTGYDPP 
MKPYGRRNEIWLLKT 


865 


2215 


A 


7246 


559 


682 


RRLGAVAHAYTSSTLGGRGG WIT* GQELQTS 
LANMAKPRLY 


866 


2216 


A 


7257 


641 


1310 


TCTYKYLMGWIRGRRSRHSWEMSEFHNYNL 

DLKKSDFSTRWQKQRCPWKSKCRENASPFF 

FCCFIAVAMGIRFUMYAIWSAVFLNSLFNQEV 

QIPLTESYCGPCPKNWICYKNNCYQFFDESKN 

WYESQASCMSQNASLLKVYSKEDQDLLKLV 

KSYHWMGLVHIK1NGSWQWEDGS1LSPNLLT 

IIEMQKGDCALYASSFKGYIENCSTPNTYICM 

QRTV 


867 


2217 


A 


7288 


151 


396 


SIKflEAFGSNGPDFWFFRYWSP*LFRQQWFl 
MPFFQTLWLMNANRFCS1FTTTNVANNCWW 
TPYHCWLSWVCRCESHGI 


868 


2218 


A 


7298 


3 


272 


PDTVIGGRGSGGKEFGRWVLW*VFE*RLGTP 
KGSCPAGGSRMVSESD* EGRGC* ASYPCAC* 
AGS*WR*GSRPAGRGTPPRSLSHARPP 


869 


2219 


A 


7332 


1223 


332


PRRDAEDRDESCLNPAFPIGLLHPNSVNSMAR 

FLTLCrWIiLLGPGLLArVRAECSQDCA'rcS 

YRLVRPADINFLACVMECEGKLPSLKJWETC 

KELLQLSKPELPQDGTSTLRENSKPEESHLLA 

KRYGGFMKRYGGFMKKMDELYPMEPEEEA 

NGSEILAKRYGGFMKKDAEEDDSLANSSDLL 

KELLETGDNRERSHHQDGSDNEEEVSKRYGG 

FMRGLKRSPQLKEKAKELQKRYGGFMRRVG 

PQKW*MTSPQNRYGGFLKRFAEALPSDEEGE 

SYSKEVPEMEKRYGGFMRF 


870 


2220 


A 


7382 


216 


1018 


EIHQRLTERTQFLDESRKNPNS*QANLLRGGG 

AGQGRGREGAESGGSRGEGPGSDGRLPATGD 

FWSPRSQRRGCCGRRAPRPEAMENGA V YSPT 

"EEEDPGPARGPRSGLAAYFFMGRLPLLRRVL 

KGLQLLLSLLAFICEEWSQCTLCGGLYFFEF 

VSCSAFLLSLLHIVYCTPFYERVDTTKVKSSD 

FYITLGTGCVFLLASIIFVSTHDRTSAEIAATVF 

GFIASFMFLLDFITMLYEKRQESQLRKPENTT 

RAEALTEPLNA 


871 


2221 


A 


7403 


3 


393 


SCAMCSGLL* LLLPIWLSWTLGTRGSfefRS Vr4 
DPGNMSFVKETVDKLLTGFRCFREREAAPRR 
ALRGAALPGESEAGDPESLRSSVNADWIQYS 
DLWEAEVSTPRCEAGFCQECFRTPGNQEKDG
PFIC 


872 


2222 


A 


7413 


1061


359 


FVD1VSVVEFPHCPEARFPAQHGQDSKRLTLC 

PGGS*PQATLHLDRMRVSASPTKEIQVKKYK 

CGLIKPCPANYFAFKIC SGAANW GPTMCFED 

RMMSPVKNNVGRGLNIALVNGTTGAVLGQ 

KAFDMYSGDVMHLVKFLKEIPGGALVLVAS 

YDDPGTKMNDESRKLFSDLGSSYAKQLGFRD 

SWVFIGAKDLRGKSPFEQFLKEQPQTQNKYE 

GWPELLEMEGCMPPKPF 


873 


2223 


A 


7429 


2242 


2394


ILKCAGHGGSCL* SQHFGRLRWEDRLRLGVQ 
DHPGQHCETPSLLKIERKLF 


874 


2224 


A 


7468 


146 


894 


PCI'SCVLWATLHLPASTRKAPQAECGMiSriE 
WQKIGVGITGFGIFFILFGTLLYFDSVLLAFGN 
1XFLTGLSLIIGLRKTFWFFFQRHKLKGTSFLL 
GGWIVLLRWPLLGMFLETYGFFSLFKGFFPV 
AFGFLGNVCNIPFLGALFRRLQGTSSMV*KTE 
MSSLNLDHWLKGAKREEWEPPPQSPALTHSP 
TYPGPPQV QKERNGAEQLTSNPQ VDSRGCQE 
AEMQTPRRLGWGWYHTLTLYLWEEK


875 


2225 


A 


7498 


91 


251 


GEKPVPTWLQDEAGQWLLGFVAQPWGWPG 
SERHEP*HGGVLFRLGPSAPPGKL 


876 


2226 


A 


7544 


403 


587 


ysclcflfkhitsfknsvhiwlgtvvhaynpn 
ilggqgg\via*gqefktslgntvrpclyk: 


877 


2227


A 


7566 


2 


940 


GCAPDTRFFVPEPGGRGAAPWVALVARGGG 

TFKDKVLVAARRNASAWLYNEERYGNITLP 

MSHAGTGNIWIMISYPKGREILELVQKGIPV 

TMTIGVGTRHVQEFISGQSWFVATAFITMMII 

SLAWLIFYYIQRFLYTGSQIGSQSHRKETKKVI 

GQLLLHTVKHGEKGIDVDAENCAVC1ENFKV 

KDUIULPCKHIFHRICIDPWLLDHRTCPMCKL 

DVIKALGYWGEPGDVQEMPAPESPPGRDPAA 

NLSLALPDDDGSDESSPPSASPAESEPQCDPSF 

KGDAGENTALLEAGRSDSRHGGPIS 


878 


2228 


A 


7586 


315 


1232 


ERSLLCK\T)VRWIYYSEGTKTQRRHRQGSLR 

RGRMQAACWYVLFULQPTVYLVTCANLTNG 

GKSELLKSGSSKSTLKHIWTESSKDLSISRLLS 

QTFRGBCENDTDLDLRYDTPEPYSEQDLWDW 

LRNSTDLQEPRPRAKRRPrVKTGKFKKMFGW 

GDFHSNIKTVKLNLLITGKIVDHGNGTFSVYF 

RHNSTGQGNVSVSLVPPTKIVEFDLAQQTVID 

AKDSKSFNCRffiYEKVDKATKNTLCNYDPSK 

TCYQEQTQSHVSWLCSKPFKVICIY1SFYSTD 

YKLVQKVCPDYNYHSDTPYFPSG 


879 


2229 


A 


7605 


479 


391 


TESWKLKWWSPTCLDQLNGSAPGNVF1HG 


880 


2230 


A 


7612 


93 


659 


DAAVAMTAQGGLVANRGRRFKWAIELSGPG 

GGSRGRSDRGSGQGDSLYPVGYLDKQVPDTS 

VQETDRILVEKRCWDIALGPLKQIPMNLFIMY 

MAGKTISIFPTMMVCMMAWRPIQALMAISAT 

FKMLESSS QKFLQGLV YHuNLMGLALA Y Yiv 

CQSMGLLPTHASDWLAFIEPPERMEFSGGGL 

LL 


881 


2231 


A 


7615 


291 


1452 


SPQKTMRSHTITMTTrSVSSWPYSSHRMRFIT 
NHSDQFPQNFSATPNVTTCPMDEKLLSTVLTT 
SYSVIFIVGLVGNIIALYVFLGIHRKRNSIQIYL 
LNVAlADLLLIFCLPFRIIvmilNQNKWIXGVIL 
CKVVGTLFYMNMYISDLLGFISLDRY1KINRSI 
QQRKATTTKQSIYVCCIVWMLALGGFLTM1IL 
TLKKGGHNSTMCHIYRDKHNAKGEAIFNFIL 














VVMFWJjPLLirr^vivinirxrT t btcv.ddovut>\t H 

SGKYATTARNSFIVLIIFTICFVPYHAFRFIYISS 
QLNVSSCYWKEIVHKTNEIMLVLSSFNSaDP 
VMYRMSSNIRKIMCQLLFRRFQGEPSRSEST 
SEFKPGYSLHDTSVAVKIQSSSKST


882


2232 


A 


7617 


67 


379 


RQMAIXKAhKDLISAGLKEFSVLLNQQVFND 
PLVSEEDMVTVVEDWMNFYINYYRQQVTGE 
PQERDKALQELRQELNTLANPFLAKYRDFLK 
SHELPSHPPPSS 


883 


2233 


A 


7622 


400 


215 


KVKTCRYNPKYSAANDTOFVDIPSREKDLAK 
AVATVGPISVAVGASHVFFQFYKKGKHLSS 


884 


2234 


A 


7638 


2640 


2861 


APVLILQMVKlSlVLtPCjFLSHDQGOLTKELQ 
QHVKSVTCPCEYLRKVSECRQMGPGALEQFP 
GLSCHTSHSG 


885

2235 


A 


7642 


201 


455 


PSRGKMELEAMSRYTSPVNPAVFPHLTVVLL 
AIGMFFTAWFFVYEVTSTKYTRDIYKELLISL 
VASLFMGFGVLFLLLWVGIYV 


886 


2236 


A 


7692 


61 


569 


APENPFSRQHFNSETKVKLSLKTGTWLGNHA 

HLGEHFSTHHELGLSGKVVGFLVKNILEVIRN 

GGMETRHPGKVSSWFHRWDSRAEQHNHAE 

HHEDVPQGDEDSKVSEAQQEFPDWTCAGLP 

GLLPKALRVLLFQLKVQHRPGIHQQRPEQQD 

VSDHRYGRSVRQNRK 


887 


2237 


A 


7693 


85 


315 


NPGCCLPVAMRTSYLLLFTLCLLLSEMASGG 

NFLTGLGHRSDHYNCVSSGGQCLYSACPIFTK 

IQGTCYRGKAKCCK


888


2238 


A 


7702 


242 


1298 


APSHRRRYLSPSRSAGQLGNMALERLCSVLK 

VLLITVLWEGIAVAQKTQDGQNIGIKHIPAT 

QCGIWVRTSNGGHFASPNYPDSYPPNKECIYI 

LEAAPRQR1ELTFDEHYYIEPSFECRFDHLEVR 

DGPFGFSPLIDRYCGVKSPPLIRSTGRFMWIKF 

SSDEELEGLGFRAKYSFffDPDFTYLGGILNPIP 

DCQFELSGADGIVRSSQTCQEEKTKPGQAVD 

CIWTIKATPKAKIYLRFLDYQMEHSNECKRNF 

VAVYDGSSSIENLKAKFCSTVANDVMLKTGI 

GVIRMWADEGSRLNRFRMLFTSFGGASPAQA 

ALSFCHSNMCINNSLVCNGVQNCAYPWDEN 

HC 


889 


2239 


A 


7707 


185 


2911 


CHY1MNPSTHHPASAGGS1LGLFDFFGLGLGE 

MTMDALLARLKLLNPDDLREEIVKAGLKCGP 

ITSTTRFIFEKKLAQALLEQGGRLSSFYHHEA 

GVTALSQDPQRILKPAEGNPTDQAGFSEDRDF 

GYSVGLNPPEEEAVTSKTCSVPPSDTDTYRAG 

ATA SKEPPLYYG VCPVYEDVPARNERI YVYE 

NKKEALQAVKMDCGSRFKAFSTREDAEKFAR 

GICDYFPSPSKTSLPLSPVKTAPLFSNDRLKDG 

LCLSESETVNKERANSYKNPRTQDLTAKLRK 

AVEKGEEDTFSDLrWSNPRYLIGSGDNPTIVQ 

cajv^iv in v jvin v AAKIlINv^Ao ILV^L 1 LD V LENP 

DFMRtMYPDDDEAMLQKRIRYVVDLYLNTP 

DKMGYDTPLHFACKFGNADWNVLSSHHLI 

VKNSRNKYDKTPEDVICERSKNKSVELKERIR 

EYLKGHYYVPLLRAEETSSPVIGELWSPDQTA 

EASHVSRYGGSPRDPVLTLRAFAGPLSPAKAE 

DFRKLWKTPPREKAGFLHHVKKSDPERGFER 

VGRELAHELGYPWVEYWEFLGCFVDLSSQE 

GLQRLEEYLTQQEIGKKAQQETGEREASCRD 

KATTSGSNSISVRAFLDEDDMSLEEIKNRQNA 

ARNNSPPTVGAFGHTRCSAFPLEQEADLIEAA 














EPGGPHSSRNGLCHPLNHSRTLAGKRPKAPK 

GEEAHLPPVSDLTVEFDKLNLQNIGRS V SKTP 

DESTKTKDQILTSRINAVERDLLEPSPADQLG 

NGHRRTESEMSARIAKMSLSPSSPRHEDQLEV 

TREPARRLFLFGEEPSKLDQDVLAALECADV 

DPHQFPAVHRWKSAVLCYSPSDRQSWPSPAV 

KGRFKSQLPDLSGPHSYSPGRNSVAGSNPAKP 

GLGSPGRYSPVHGSQLRRMARLAELAAL 


890 


2240 


A 


7711 


360 


249 


RHMPVIPALWEAEVGGLLEPRSSRSAWATE 


891 


2241 


A 


7721 


61 


U75 


KLPWEPSFLDCMQIIRHSEQTLKTALISKNPVL 

VSQYEKJUDAGEQRLMNEAFQPASDLFGPITL 

HSPSDWITSHPEAPQDFEQFFSDPYRKTPSFN 

KRSIYIQSIGSLGNTRIISEEYIKWLTGYCKAYP 

YGLRVKXLEPVPVSVTRCSFRVNENTHNLQIH 

AGDILKFLKKKKPEDAFCVVGITMIDLYPRDS 

WNFVFGQASLTDGVG1FSFARYGSDFYSMHY 

KGKVKKLKKTSSSDYSIFDNYYIPEITSVLLLR 

SCKTLTHEIGHIFGLRHCQWLACLMQGSNHL 

EEADRRPLNLCPICLHKLQCAVGFSIVERYKA 

LVRWIDDESSDTPGATPEHSHEDNGNLPKPV 

EAFKEWKEWUKCLAVLQK 


892 


2242 


A 


7723 


2 


1650 


SAPTAPARPCRAERGSGGGMLALLAASVALA 

VAAGAQDSPAPGSRFVCTALPPEAVHAGCPL 

PAMPMQGG AQSPEEELRAAVLQLKETV VQQ 

KETLASARA1RELTGKLARCEGLAGGKARGA 

GATGKDTMGDLPRDPGHWEQLSRSLQTLK 

DRLESLEPLPAMPMQGGAQSPEEELRAAVLQ 

LRETWQQKETLASARAKELTGKLARCEGL 

AGGKARGAGATGKDTMGDLPRDPGHWEQ 

LSRSLQTLKDRLESLEHQLRANVSNAGLPGD 

FREVLQQRLGELERQLLRKGAELEDEKSLLH 

NETSAHRQKTESTLNALLQRVTELERGNSAF 

KSPNAFKVSLPLRTNYLYGKUCKTLPELYAFT 

ICLWLRSSASPGMGTPFSYAVPGQANEIVLIE 

WGNNPIELUNDKVAQLPIJFVSDGKWHHICV 

TWTTRDGMWEAFQDGKXLGTGENLAPWHPI 

KPGGVLILGQEQDTVGGRFDATQAFV GELSQ 

FNIWDRVLRAQEIVN1 ANC STNMPGN1IP WVD 

NNVDVFGGASKWPVETCEERLLDL 


893 


2243 


A 


7729 


3554 


2419 


LTAGTAMNYPLTLEMDLENLEDLFWELDKL 

DNYNDTSLVENHLCPATEGPLMASFKAVFVP 

VAYSLIFLLGVIGNVLVLV1LERHRQTRSSTET 

FLFHLAVADLLLVFILPFAVAEGSVGWVLGTF 

LCKTVIALIIKVNFYCSSLLLACIAVDRYLAIV 

HAVHAYRHRKLLS1HITCGTIWLVGFLLALPEI 

LFAKVSQGHHNNSLPRCTFSQENQAETHAWF 

TSRFLYHVAGFLLPMLVMGWCYVGWHRLR 

QAQRRPQRQKAVRVAILVTSIFFLCWSPYHTV 

TFLDTLARLKAVDNTCKLNGSLPVAITMCEFL 

GLAI ICCLNPMLYTFAG VKPRbDLoKLL 1 ivw 

CTGPASLCOLFPSWRRSSLSESENATSLTTF 


894 


2244 


A 


7738 


670 


287 


FVTRAGRWGAGARVRGGAGGMASGAARWL 

VLAPVRSGALRSGPSLRKDGDVSAAWSGSGR 

SLVPSRSVIVTRSGAILPKPVKMSFGLLRVFSI 

VIPFLYVGTLISKNFAALLEEIIDIFVPEDDDDD 

D 


895 


2245 


A 


7753 


119 


278 


APYAHSQVHCLDKVCGLLPFLNPEVPDQFYR 
LWLSLFLHAGKEAPHCPRTRPL 


896 


2246 


A 


7754 


1 


372 


SPAWWNSQQRWSPFLALLTLEPTFHHLLPIM 














QVSTAALAJVUJirM^ 

ACCFSYTSRQIPQNFlADYFETSSQCskpSVIFL 

TKRGRQVCADPSEEWVQKYVSDLELSA 


897 


2247 


A 


7761 


1725 


445 


RPRRRGTHHFSC VLG SFR VS AMFPR V STFLPL 

RPLSRHPLSSGSPETSAAA1MLLTVRHGTVRY 

RSSALLARTKNNIQRYFGTNSVICSKKDKQSV 

RTEETSKETSESQDSEKENTKKDLLGIDCGMK 

VELSTVNVRTTKPPKRRPLKSLEATLGRLRRA 

TEYAPKKRIEPLSPELVAAASAVADSLPFDKQ 

TTKSELLSQLQQHEEESRAQRD AKRPK1 SFSNI 

ISDMKVARSATARVRSRPELRIQFDEGYDNYP 

GQEKTDDLKKRKNIFTGKRLNIFDMMAVTKE 

APETDTSPSLWDVEFAK.QLAJVNEQPLQNGF 

EELIQWTKEGKLWEFPINNEAGFDDDGSEFH 

EHIFLEKfflJBSFPKQGPlRHFMELVTCGLSKNP 

YLSVKQKVEHEWFRNYFNEKKDILKESNIQF 

KLRPWKFLFRNN 


898 


2248 


A 


7775


85 


496 


SCQTTQPPAQSCSTGTMRIMLLFTA1LAFSLA 
QSFGAVCKEPQEEWPGGGRSKRDPDLYQLL 
QRLFKSHSSLEGLLKALSQASTDPKESTSPEK 
RDMHDFFVGLMGKRSVQPDSPTDVNQENVP 
SFGHKYPPRAE 


899 


2249 


A 


7785 


179 


703 


PFHLGASSNTFRLQVQTQESKAQKEVKMGFI 

FSKSMNESMKNQKEFMLMNAKLQLERQLIM 

QSEMRERQMAMQIAWSREFLKYFGTFFGLA 

AISLTAGAIKKKKPAFLVPIVPLSFILTYQYDL 

GYGTLLERMKGEAEDILETEKSKLQLPRGMTT 

FESIEKARKEQSRFFIDK 


900 


2250 


A 


7789 


1465 


300 


VWLPLKSYKIRSPSLHCQCEIFREEFLFS SLQE 

GRDKDTFSKMAMVSEFLKQAWFIENEEQEY 

VQTVKSSKGGPGSAVSPYTONPSSDVAALH 

KA1MVKGVDEATODILTKRNNAQRQQIKAAY 

LQETGKPLDETLKKALTGHLEEWLALLKTP 

AQFDADELRAAMKGLGTDEDTLIEILASRTN 

KEIRDINRVYREELKRDLAKDITSDTSGDFRN 

ALLSLAKGDRSEDFGVNEDLADSDARALYEA 

GERRKGTDVNVFNTILTTRSYPQLRRVFQKY 

TKYSKHDMNKVLDLELKGDIEKCLTATVKCA 

TSKPAFFAEKLHQAMKGVGTRHKALIRIMVS 

RSEIDMNDKAFYQKMYGISLCQAILDETKGD 

YEKILVALCGGN 


901 


2251 


A 


7796 


2 


807 


VEFHPQRARAGARAPSMGVLLTQRTLLSLVL 

ALLFPSMASMAAIGSCSKEYRVLLGQLQKQT 

DLMQDTSRLLDPYIRIQGLDVPKLREHCRERP 

GAFPSEETLRGLGRRCRQTLNAUGCVLHRL 

ADLEQRLPKAQDLERSGLNIEDLEKLQMARP 

NILGLRNNIYCMAQLLDNSDTAEPTKAGRGA 

SQPPTPTPASDAFQRKLEGCRFLHGYHRFMH 

SVGRVFSKWGESPNRSRRHSPHQALRKGVRR 

TRPSRKGKRLMTRGQLPR 


902 


2252 


A 


7802 


2 


721 


TAARRRQKGTAARRLQKGTAARRRQKGTAA 
RRRQKGTAARRPQKGTAARRRQKGTAARRR 
QKGTAARRRQKGTAARRPQKGTAARRRQKG 
TAARRRQK.GTAARRRQKGLA1ASRGCPCASR 
AGGVRGAGSRLRAMAPKVFRQYWDIPDGTD 
CHRKAYSTTSIASVAGLTAAAYRVTLNPPGTF 
LEGVAKVGQYTFTAAAVGAVFGLTTC1SAHV 
REKPDDPLNYFLGGCAGGLTLGARTHNYGIG 
AAACVYFG1AASLVKMGRLEGWEVFAKPKV 


903 


2253 


A 


7807 


1 


584 


PWLPWSDGRAARS SRKCPRSRFPVQVGKMA 

VSTVFSTSSLMLALSRHSLLSPULSVTSFRRFY 

RGDSPTDSQKDMIEIPLPPWQERTDESIETKR 

ARliYESRKRGMLENOLLSLFAKEHLQHMT 

EKQLNLYDRLINEP SND WD1YYW ATE AKPAP 

EIFENEVMA1LRDFAKNKNKEQRLRAPDLEY 

LFEKPR 


904 


2254 


A 


7813 


40 


821 


GAGRALGHLETGAGDVAAALPARKFPRSLLG 

AGARLTGWTMNVFRBLGDLSHLLAMILLLGK 

1WRSKCCKGISGKSQILFALVFTTRYLDLFTNF 

1SIYNTVMKVVFLLCAYVTVYMIYGKFRKTF 

DSENDTFRLEFLLVPVIGLSFLENYSFTLLEIL 

WTFSIYLESVAILPQLFMlSKTGEAETiri'HYL 

FFLGLYRALYLANW1RRYQTENFYDQIAWS 

GWQTIFYCDFFYLYVTKGRSWDDSNADTGL 

RSYSSI 


905 


2255 


A 


7817 


1399 


881 


LSNKDVLSPQLKDENSKLRRKLNEVQSFSEA 

QtEMWTLERKLEAKMIKEESDYHDLESVVQ 

QVEQNLELMTKRAVKAENHWKLKQEISLL 

QAQVSNFQRENEALRCGQGASLTWKQNAD 

VALQNLRWMNSAQASIEQLVSGAETLNLVA 

EILKSIDRISEVKDEEEDS 


906 


2256 


A 


7822 


3 


1462 


DSPRNRFEILGRPTRTPTRPGPRPAMEDLDAL 

LSDLETTTS HMPRSGAPKERPAEPLTPPPSYG 

HQPQTGSGESSGASGDKDHLYSTVCKPRSPK 

PAAPAAPPFSSSSGVLGTGLCELDRLLQELNA 

TQFNITDEIMSQFPSSKVASGEQKEDQSEDKK 

RPSLPSSPSPGLPKASATSATLELDRLMASLSD 

FRVQNHLPASGPTQPPWSSTNEGSPSPPEPTG 

KGSLDTMLGLLQSDLSRRGVPTQAKGLCGSC 

NKPIAGQVVTALGRAWHPEHFVCGGCSTAL 

GGSSFFEKDGAPFCPECYFERFSPRCGFCNQPI 

RHKMWALGTHWHPEHFCCVSCGEPFGDEG 

FHEREORPYCRRDFLQLFAPRCQGCQGPILDN 

YISALSALWHPDCFVCRECFAPFSGGSFFEHE 

GRPLCENHFHARRGSLCATCGLPVTGRCVSA 

LGRRFHPDHFTCTFCLRPLTKGSFQERAGKPY 

CQPCFLKLFG 


907 


2257 


A 


7828 


1792 


1671 


FIYVNQSFAPSPDQEVGTLYECFGSDGKLVLH 
YCKSQAWG 


908 


2258 


A 


7842 


110 


1172 


KLSCPCSHGTRVTAVRGPRLBCAGVQWHDLG 

SLQPPPSGLKQSSHLSLSSSWDFRHAPTHPET 

YTCPBCMTEMEQAEAQLAELDLLASMFPGENE 

LIVNDQLAVAELKDCIEKKTMEGRSSKVYFT1 

NMNLDVSDEKMAMFSLACILPFKYPAVLPEI 

TVRSVLLSRSQQTQLNTDLTAFLQKHCHGDV 

CILNATEWVREHASGYVSRDTSSSFTTGSTVQ 

SVDLIFTRLMYSHHIYNKCKRKNILEWAKEL 

SLSGFSMPGKPGWCVEGPQSACEEFWARLR 

KLNWKRILIRIIREDIPFDGTNDETERQRKFSIF 

EEKVFSVNGARGNHMDFGQLYQFLNTKGCG 

DVFQMFLWV 


909 


2259 


A 


7870 


3067 


2923 


EGICVYTFIYVHMYTRTCMHTYPYMYMNSV 
LISSEILLIPSKYLFESK 


910 


2260 


A 


7884 


212 


4874 


GALTWSHPLLAVCPQGVWLGSTPSGSPALLP 
PSHRVNAEPGCWTNACASGPCPPHANCRDL 
WQTFSCTCQPGYYGPGCVDACLLNPCQNQG 
SCRHLPGAPHGYTCDCVGGYFGHHCEHRMD 
QQCTRGWWGSPTCGPCNCDVHKGFDPNCNK 














TNGOCHCK^FJSYT^RQSPSCLPC^CYPVnST 

SRSCAPHSGQCPCRPGALGRQCNSCDSPFAEV 

TASGCRVLYDACPKSLRSGVWWPQTKFGVL 

ATVPCPRGALGLRGAGAAVRLCDEAQGWLE 

PDLFNCTSPAFRELSLLLDGLELNKTALDTME 

AKKLAQRJLREVTGHTDHYFSQDVRVTARLL 

AHLLAFESHQQGFGLTATQDAHFNENLLWA 

GSAIXAPETGDLWAALGQRAPGGSPGSAGLV 

RHLEEYAATLARNMELTYLNPMGLVTPNIML 

SIDRMEHPSSPRGARRYPRYHSNLFRGQDAW 

DPHTHVLLPSQSPRPSPSEVLPTSSSIENSrrTSS 

WPPPAPPEPEPGISIIILLVYRTLGGLLPAQFQ 

AERRGARLPQNPVMNSPVVSVAVFHGRNFLR 

GDLESPISLEFRLLQTANRSKAICV Q WDPPGLA 

EQHGVWTARDCELVHRNGSHARCRCSRTGT 

FGVLMDASPRERLEGDLELLAVFTHVWAVS 

VAALVLTAAILLSLRSLKSNVRGIHANVAAA 

LGVAET.LaLGrHRTHNQI.VCTAWILLHYFF 

LSTFAWLFVQGLHLYRMQVEPRNVDRGAMR 

FYHALGWGVPAVLLGLAVGLDPEGYGNPDF 

CWISVHEPLIWSFAGPWLVIVMNGTMFLLA 

ARTSCSTGQREAKKTSALTLRSSFLIXLLVSA 

SWLFGLLAVNHSILAFHYIHAGLCGLQGLAV 

LLLFCVLNADARAAWMPACLGRKAAPEEAR 

PAPGLGPGAYNNTALFEESGLmiTLGASTVSS 

VSSARSGRTQDQDSQRGRSYLRDNVLVRHGS 

AADHTDHSLQAHAGPTDLDVAMFHRDAGA 

DSDSDSDLSLEEERSLSIPS SESEDNGRTRGRF 

QRPLCRAAQSERLLTHPKDVDGNDLLSYWPA 

LGECEAAPCALQTWGSERRLGLDTSKDAAN 

NNQPDPALTSGDETSLGRAQRQRKGILKNRL 

QYPLVPQTRGAPELSWCRAATLGHRAVPAAS 

YGRIYAGGGTGSLSQPASRYSSREQLDLLLRR 

QLSRERLEEAPAPVLRPLSRPGSQECMDAAPG 

RLEPKDRGSTLPRRQPPRDYPGAMAGRFGSR 

DALDLGAPREWLSTLPPPRRTRDLDPQPPPLP 

LSPQRQLSRDPLLPSRPLDSLSRSSNSREQLDQ 

VPSRHPSREALGPLPQJLLRAREDSVSGPSHGP 

STEQLDILSSILASFNSSALSSVQSSSTPLGPHT 

TATPSATASVLGPSTPRSATSHSTSELSPDSEPR 

DTQALLSATQAMDLRRRDYHMERPLLNQEH 

LEELGRWGSAPRTHQWRTWLQCSRARAYAL 

LLQHLPVLVWLPRYPVRDWLLGDLLSGLSVA 

IMQLPQGLAYALLAGLPPVFGLYSSFYPVHY 

FLFGTSRHISVESLCVPGPVDT 


911 


2261 


A 


7890 


21 


806 


EFGTSRSSRSMAEDLGLSFGETAS VEMLPEHG 

SCRPKARSSSARWALTCCLVLLPFLAGLTTYL 

LVSQLRAQGEACVQFQALKGQEFAPSHQQV 

YAPLRADGDKPRAHLTWRQTPTQHFKNQFP 

ALHWEHbLGLAFntJlRMW 

YFIYSQVTFRGMTSECSEIRQAGRPNKPDSITV 

VITKVTDSYPEPTQLLMGTKSVCEVGSNWFQ 

PIYL GAMF SLQEGDKLM VNVSDISL VD YTKE 

DKTFFGAFLL 


912 


2262 


A 


7891 


U63 


ill 


acgirhegalpgltAtpeAMlrfl^dLAfsfl^ 
lhalgqavqfqeyvflqflgldkapspqkfq 
pvpyilkkifqdreaaattgvsrdlcyvkelg 
vrgnvlrflpdqgfflypkkisqassclqkll 

YFNLSADCEREQLTLAQLGLDLGPNSYYNLGP 
ELELALFL VQEPHV WGQTTPKPGKMFVLRS V 






WO 01/57188 



PCT/US01/03800 



SEQ ID NO: of nucleotide sequence

SEQ ID NO: of peptide sequence

Method

SEQ ID NO: in USSN 09/496,914

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

Amino acid sequence (A=Alanine, C=Cysteine,
D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion)
PWPQQAVHFhn^LDVAKDWNDNPRKNFGLFL

EILVKEDRDSGVNFQPEDTCARLRCSLHASLL 

VVTLNPDQCHPSRKRKAAlPVHKI.SCKJvlLCH 

RHQLFINFRDIJ3WHKWIIAPKGFMANYCHGE 

CPFSLTISLNSSNYAFMQALMHAVDPEDPQAV 

CIFW.SPIS^YQDr^NVILRHYEDNfVVD 

ECGCG 


913 


2263 


A 


7892 


15 


849 


ASRLPRGPGCGADMRPLLGLLLVFAGCTFAL 

YLLSTRLPRGRRLGSTEEAGGRSLWFPSDLAE 

LRELSEVLREYRKEHQAYVFLLFCGAYLYKQ 

GFAIPGSSFLNVLAGALFGPWLGLLLCCVLTS 

VGATCCYLLS SIFGKQL WSYFPDKV ALLQR 

KWJENRNSLFFFIJJTJUiTMTPNWFI^LSA^ 

LNIPIVQFFFSVLIGLIPYNFICVQTGSILSTLTS 

LD ALFSW DT VFKLLA1AMV ALIPGTLIKKFSQ 

KHLQLNETSTANHIHSRKDT 


914 


2264 


A 


7893 


815 


959 


KSGWVWWLTPLIPALWEAQTEGSLRPEVKN 
RLSNITRPFFSKKKKILV 


915 


2265


A 


7909 


3 


641


HASGPGGLLRRRRGSGANMPVARSWVCRKT 

YVTPRRPFEKSRLDQELKLIGEYGLRNKREV 

WRVKFTLAKJRKAARELLTLDEKDPRRLFEG 

NALLRRLVRIGVLDEGKMKLDYILGLKDEDFL 

ERRLQTQVFKLGLAKSIHHAHVLIQQCH1RVR 

EQVVNILFFTVRLDSQKHIDFSLCFPIGVANPS 

HVKRKNASKGQGGAGARDDEEEE 


916 


2266 


A 


7914 


3 


967 


VAHTQWHTCQRLSQLTHRSILKYLLIDTHAC 

QVLILKHTHASLSLPSCQECFPSSIPSASHMVS 

HPHPPPSPRWGQTPEGLPAASPCGPGPRSCFS 

SILrTGDSWGMLACLCTVLWHLPAVPALNRT 

GDPGPGPSIQKTYDLTRYLEHQLRSLAGTYLN 

YLGPPFNEPDFNPPRLGAETLPRATVDLEVW 

RSLNDKLRLTQNYEAYSHLLCYLRGLNRQAA 

TAELRRSLAHFCTSLQGLLGSIAGVMAALGY 

PLPQPLPGTEPTWTPGPAHSDFLQKMDDFWL 

LKELQTWLWRSAKDFNRLKKKMQPPAAAVT 

LHLGAHGF 


917 


2267 


A 


7921 


2 


1166 


RPRRGQGLVQEVQTENVTVAEGGVAE1TCRL 

HQYDGSIWIQNPARQTLFFNGTRALKDERFQ 

LEEFSPRRVR1RLSDARLEDEGGYFCQLYTED 

THHQIATLTVLVAPENPWEVREQAVEGGEV 

ELSCLVPRSRPAATLRWYRDRKBLKGVSSSQ 

ENGKVWSVASTVRFRVDRKDDGGfflCEAQN 

QALPSGHSKQTQYVLDVQYSPTARMASQAV 

VREGDTLVLTCAVTGNPRPNQERWNRGNESL 

PERAEAVGETLTLPGLYSADNGTYTCEASNK 

HGHARALYVLVVYGESRLRPTEGGGGAPDP 

GAWEAQTSVPYAIVGGILALLVFLnCVLVG 

MWCSVRQKGSYLTHEASGLDEQGEAREAF 

LNGSDGHKRKEEFFI 


918 


2268 


A 


7938 


3 


2653 


RRRLPPASPPSSSVSSSLSPSAVVMACRWSTK 

ESPRWRSALLLLFLAGVYGNGALAEHSENVH 

ISGVSTACGETPEQIRAPSGIITSPGWPSEYPAK 

INCSWFIRANPGEIITISFQDFDIQGSRRCNLD 

WLTIETYKNIESYRACGSTIPPPYISSQDHIWIR 

FHSDDNISRKGFRLAYFSGKSEEPNCACDQFR 

CGNGKC1PEAWKCNNMDECGDRSDEEICAKE 

ANPPTAAAFQPCAYNQFQCLSRFTKVYTCLP 

ESLKCDGNIDCLDLGDE1DCDVPTCGQWLKY 

FYGtFNSPNYPDFYPPGSNCTWLIDTGDHRK 
VILRFTDFKIJ)QTQyGDYVJCIY,DOIJEENPHK 1

LLRVLTAFDSHAPLTWSSSGQIRVHFCADKV 

NAARGFNATYQVDGFCLPWBIPCGGNWGCY 

TEQQRCDGYWHCPNGRDETNCTMCQKEEFP 

CSRNGVCYPRSDRCNY QNHCPNGSDEKNCFF 

CQPGNFHCXNNRCVFESWVCDSQDDCGDGS 

DEENCPVIVPTRVITAAVIGSLICGLLLVIALG 

CTCKLYSLRMFERRSFETQLSRVEAELLRREA 

PPSYGQUAQGLIPPVEDFPVCSPNQASVLENL 

RLAVRSQLGFTSVRLPMAGRSSNIWNRIFNFA 

RSRHSGSLALVSADGDEWPSQSTSREPERNH 

THRSLFSVESDDTDTENERRDMAGASGGVAA 

PLPQKVPPTTAVEATVGACASSSTQSTRGGH 

ADNGRDVTSVEPPSVSPARHQLTSALSRMTQ 

GLRWVRFTLGRSSSLSQNQSPLRQLDNGVSG 

REDDDDVEMLIPISDGSSDFDVNDCSRPLLDL 

ASDQGQGLRQPYNATNPGVRPSNRDGPCERC 

GIVHTAQIPDTCLEVTLKNETSDDE Af .1 ,T ,C 


919 


2269 


A 

•TV 


/SOI 


lb/4 


1839 


VVRVTCCPPARSTTERTNAYDEEDCVEMVAS 
GGWNDVACHTTMYFMCEFDKKNM 


920 


2270 


A 


7953 


47 


572 


GGRASWPEQAKEPRREGHTDKQQTEDVLAA 

GLRGLPHLPAICARRMSPAFRAMDVEPRAKG 

VLLEPFVHQVGGHSCVLRFNETTLCKPLVPRE 

HQFYETLPAEMRKFTPQYKGKSQLLEGLPHW 

RGDVRDRGHGRPWQPSLEPSLPPTLCFPSLSS 

FSSSWPSAQHLTPSVFNPW 


921 


2271 


A 


7957 


612 


812 


RSGRTWTGIGYSKALQSSNRNTKSLLQNEF

MMVYSFRALSFKESTWATFQHGGEATKSRSL 

SSTQ 


922 


2272 


A 


7967 


1443 


1660 


ENITEKWKEIWMCRGNKKSCCWTFIKDRHLT 
VSCCKSKSGETLLICIFCSNLVGFFFFGIRGFSN 
WELVKPN 


923 


2273 


A 


7981 


1 


3023 


GSAPRAATAMARARPPPPPSPPPGLLPLLPPLL 

LLPLLLLPAGCRALEETLMDTKWVTSELAWT 

SHPESGWEEVSGYDEAMNP1RTYQVCNVRES 

SQNNWLRTGFIWRRDVQRVYVELKFTVRDC 

NSIPNIPGSCKETFNLFYYEADSDVASASSPFW 

MENPYVKVDTIAPDESFSRLDAGRVNTKVRS 

FGPLSKAGFYLAFQDQGACMSLISVRAFYKK 

CASTTAGFALFPETLTGAEPTSLVIAPGTCIPN 

AVEVSVPLKLYCNGDGEWMVPVGACTCATG 

HEPAAKESQCRPCPPGSYKAKQGEGPCLPCPP 

NSRTTSPAASICTCHNNFYRADSDSADSACTT 

VPSPPRGVISNVNETSULEWSEPRDLOVRDD 

LLYNVICKKCHGAGGASACSRCDDNVEFVPR 

QLGLSEPRVHTSHLLAHTRYTFEVQAVNGVS 

GKSPLPPRYAAVNITTNQAAPSEVPTLRLHSS 

SGSSLTLSWAPPERPNGVILDYEMKYFEKSEG 
lASTVTSOMTsJSVOr DHr BPriA J? vwnvp adt 

VAGYGQYSRPAEFETTSERGSGAQQLQEQLP 

LIVGSATAGLVFWAVWIAIVCLRKQRHGS 

DSEYTEKLQQYIAPGMKVYIDPFTYEDPNEA 

VREFAKEIDVSCVK1EEVIGAGEFGEVCRGRL 

KQPGRRE\HFVAIKTLKVGYTERQRRDFLSEA 

SIMGQFDIIPNIIRLEGVVTKSRPVMILTEFME 

NCALDSF1RLNDGQFTVIQLVGMLRGIAAGM 

KYLSEMNYVHRDLAARNILVNSNLVCKVSDF 

GLSRFLEDDPSDPTYTSSLGGKIPIRWTAPEAI 

AYRKFTSASDVWSYGIVMWEVMSYGERPY 
WDMSNQDVINAVKQDYRLPPPMDCPTALHQ

LMIJ)CWVRDRNLRPKFSQIVKrLDKLKNAA 

SUCVIASAQSGMSQPLLDRTVPDYTTFTTVGD 

WLDAIKMGRYKESFVSAGFASFDLVAQMTA 

EDLLRIGVTLAGHQKK1LSSIQDMRLQMNQT 

LPVQV 


924 


2274 


A 


7985 


1 


503 


FRPRTKKATAMYLEHYLDSIENLPCELQRNF 

QLMRELDQRTEDKKAEIDILAAEYISTVKTLS 

PDQRVERLQKIQNAYSKCKEYSDDKVQLAM 

QTYE\m>KHIRRLDADLARFEADLKDKMEG 

SDFESSGGRGLKKGRGQKEKRGSRGRGRRTS 

EEDTPKKKKHKGG 


925 


2275 


A 


7994 


447 


589 


LPCSFCAQCMSSFERVWLQQSHFHNPRWNSR 
SPIRCYCQHWPHCVHC 


926 


2276 


A 


7996 


925 


582 


GPCKVCCITL AIMLQCH SFYRKD VQ VEHPKS 
LNPKYSQIENFLSADMALKRKCLLSISDLDFW 
IWDAQPVG1MQTLQNLKKIPNPGCFWSQAFQI 
KDTQP1LPLGGRYYIT1RQ 


927 


2277 


A 


7^8 


2 


353 


klQRPLNSRSPNHSLlFVkAELTAKQATMKLSV 
CLLLVTLALCCYQANAEFCPALVSELLDFFF1 
SEPLFKLSLAKFDAPPEAVAAKLGVKRCTDQ 
MSLQKRSL1AEVLVKILKKCSV 


928 


2278 


A 


8004 


130 


588 


LAPLRCQPGTRTQPRSHPAANDPSAAMSAAG 
ARGLRATYHRLLDKVELMLPEKLRPLY'NHPA 
GPRTVFFWAPIMKWGLVCAGLADMARPAEK 
LSTAQSAVLMATGFIWSRYSLVIIPKNWSLFA 
VNFFVGAAGASQLFRIWRYNQELKAKAHK 


929 


2279 


A 


8007 


2 


1016 


EFARRRVFIAAREMSLLRSLRVFLVARTGSYP 

AGSLLRQSPQPRHTFYAGPRLSASASSKELLM 

KLRRKTGYSFVNCKKALETCGGDLKQAEIWL 

HKEAQKEGWSKAAKLQGRKTKEGLIGLLQE 

GNTTVLVEVNCETDFVSRNLKFQLLVQQVAL 

GTMMHCQTLKDQPSAYSKGFLNSSELSGLPA 

GPDREGSLKDQLALMGKLGENMILKRAAWV 

KVPSGFYVGSYVHGAMQSPSLHKLVLGKYG 

ALVICETSEQKTNLEDVGRRLGQHWGMAPL 

SVGSLDDEPGGEAETKMLSQPYLLDPS1TLGQ 

YVQPQGVSWDFVRFEOGEGEEAAETE 


930 


2280 


A 


8008 


3 


1679 


NSRVWGP WTEPSAG SLRPMARKQNRNSKEL 

GLVPLTDDTSHAGPPGPGRALLECDHLRSGV 

PGGRRRKDWSCSLLVASLAGAFGSSFLYGYN 

LSVVNAFTPYIKAFYNESWERRHGRPIDPDTL 

TLLWSVTTSIFAIGGLVGTUVKMIGKVLGRK 

HTLLANNGFAISAALLMACSLQAGAFEMLIV 

GRHMGIDGGVAI^VLPMYLSEISPKEIRGSLG 

QVTAIFtCIGVFTGQLLGLPELLGKESTWPYLF 

GVIVVPAVVQLLSLPFLPDSPRYLLLEKHNEA 

RAVKAFQTFLGKADVSQEVEEVLAESRVQRS 

IRLVSVLELLRAPYVRWQVVTVTVTMACYQL 

CGLNAIWFYTNSFGKAGIPPAKIPYVTLSTGG 

IETLAAVFSGLVIEHLGRRPLLIGGFGLMGLFF 

GTLTITLTLQDHAPWVPYLSIVGILAIIASFCSG 

PGGIPFELTGEFFQQSQRPAAFnAGTVNWLSN 

FAVGLLFPFIQKSLDTYCFLVFATICITGAIYL

YFVLPETKNRTYAEISQAFSKRNKAYPPEEKI 

DSAVTDGKINGRP 


931 


2281 


A 


8009 


861 


300 


AAGAWSAMPKAKGKTRRQKFGYSVNRKRL 
NRNARRKAAPRIECSHIRHAWDHAXSVRQNL 
AEMGLAVDPNRAVPLRKRKVKAMEVDIEER 
PKELVRJO > YVLNDLEABA^U?EKKQNTLS?^
LIDYVRYMVENHGEDYKAMARDEKNYYQD 
TPKQIRSKINV YKRF YPAEWQDFLD SLQKRK 
MEVE 


932 


2282 


A 


8011


412 


1 


SNLCLGNSWRWRWAKSRHHC1PTVTLSKRSG 

DIRGSHFSSPQRQRSQRVPGKETARVLRAGK 

QGRGQ1PIPCPWPPPPPPPPPGSPGPGCRQFHQ 

SLBAKARHPASVREMRGKVKMRRALRRAPA 

STRASSRQPNPK 


933 


2283 


A 


8012 


147 


1077 


PPVPPASRSDMAQNLKDLAGRLPAGPRGMGT 

ALKLLLGAGAVAYGVRESVFTVEGGHRAIFF 

NRIGGVQQDTILAEGLHFRIPWFQYPHYDIRA 

RPRKISSPTGSKDLQMVNISLRVLSRPNAQEL 

PSMYQRLGLDYEERVLPSIVNEVLKSWAKF 

NASQLITQRAQVSLURRELTERAKDFSLILDD 

VAITELSFSREYTAAVEAKQVAQQEAQRAQF 

LVEKAKQEQRQKIVQAEGEAEAAKMLGEAL 

SKNPGY1KLRK1RAAQN1SKT1ATSQNRIYLTA 

DNLVLNLQDESFTRGSDSLDCGKK 


934 


2284 


A 






982 


S QFSLSQVL VDSAEEGSLAAAAELAAQKREQ 

RLRKFRELHLMRNEARKLNHQEWEEDKRL 

KLPANWEAKfCARLEWELKEEEKKKECAARG 

EDYEKVK1XEISAEDAERWERKJCKRKNPDLG 

FSDYAAAQLRQYHRLTKQIKPDMETYERLRE 

KHGEEFFPTSNSLLHGTHVPSTEEIDRMVIDLE 

KQIEKRDKYSRRRPYNDDADDYINERNAKF 

NKKAERFYGKYTAEIKQNLERGTAV 


935 


2285 


A 


8027 


59 


310 


LVSSTVNLLTEKAPWNSLAWTVTSYVFLKFL 
QGGGTGSTGMRDSALTLLGIGPSHRHSLSIRL 
SQHSSPAPMYSQTFHILVLG 


936 
937 


2286
2287


A 
A 


8032 
8039 


1 

393 


639 
311 


SGRECNMAKTYDYLFKLLLIGDSGVGKTCVL 

FRFSEDAFNSTF1STIGIDFKIRTIELDGKRIKLQ 

IWDTAGQERFRTITTA>TRGAMGIML\^YDIT 

NEKSFDN1RNWIRNIEEHASADVEKMILGNKC 

DVNDKRQVSKERGEKLALDYGIKFMETSAK 

AN1NVENAFFTLARD1KAKMDKKLEGNSPQG 

SNQG VKITPDQQKRS SFFRC VLL 


938 


2288 


A 


8052 


675 


1334 


EETIHSENSYILEKYIPISANLTLTIA 

LHPAATSTAWLHVPPGLSMALSWVLTVLSLL 

PLLEAQIPLCANLVPVPITNATLDRrrGKWFYI 

ASAFRNEEYNKSVQEIQAITFYFTPNKTEDTIF 

LREYQTRQDQCIYNTTYLNVQRENGTISRYV 

GGQEHFAHLLILRDTKTYMLAFDVNDEKNW 

GLSVYADKPETTKEQLGEFYEALDCLRIPKSD 

VVYTDWKKDKCEPLEKQHEKERKQEEGES 


939 


2289 


A 


8055 


12 


1039 


SSVAEFPERVQLSQPQNWNFSGAGGAWSLDF 
AEQLKWSAELARLGESIMDGKQGGMDGSKP 
AGPRDPPGJRLLSNPLMG0AVSDWSPMHEAA 
IHGHQLSLRNLISOGWAVNIITADHVSPT VCPA 

CLGGHLSCVKILLKHGAQVNGVTADWHTPL 

FNACVSGSWDCVNLLLQHGASVQPESDLASP 

1HEAARRGHVECVNSLIAYGGNIDHKJSHLGT 

PLYLACENQQRACVKKLLESGADVNQGKGQ 

DSPLHAVARTASEELACLLMDFGADTQAKN 

AEGKRPVELVPPESPLAQLFLEREGPPSLMQL 

CRLRIRKCFGIQQHHKITKLVLPEDLKQFLLH 


940 


2290 


A 


8058 


2 


1203 


KVLS1REPAHSTARKASEPSQPSQPSQPGGHLI 
ARLRTMDLHLFDYSEPGNFSDISWPCNSSDG 
VVDTVMCPNMPNKSVLXYTLSFIYIFIFV1GMI

ANSVWWVNIQAKTTGYDTHCYILNLAIADL 

WVVLTIPVWVVSLVQHNQWPMGELTCKVTH 

IJPSINLFGSIFFLTCMSVDRYLSnTFTNTPSS 

RKKMVRRWCILVWLLAFCVSLPDTYYLKT 

VTSASNNETYCRSFYPEHSIKEWLIGMELVSV 

VLGF A VPFSIIA VFYFLLARA1S AS SDQEKHS S 

RKIIFSYWVFLVCWLPYHVAVLLDIFSILHY1 

PFTCRLBHALFTALHVTQCLSLVHCCVNPVL 

YSFINRNYRYELMKAFCFKYSAKTGLTKLIDA 

SRVSETEYSALEQSTK 


941 


2291 


A 


8059 


73 


432 


DMAGLNITiVTSLLFLGVCAHHnPTGSVVLPS 
PCCMFFVSKRIPENRWSYQLSSRSTCLKAGV 
IFTTKKGQQFCGDPKQEWVQRYMKNLDAKQ 
KKASPRARAVAVKGPVQRYPGNQTTC 






A 


8067 


278 


1262 


GGIGEIKQRPSCLGRCLDPSLSVLMNISLGLGS 

VFSAV1SQKPSRDICQRGTSLTIQCQVDSQVT 

NIMFWYRQQPGQSLTLlATANQGSEAiTESGF 

V1DKFPISRPNLTFSTLTVSNMSPEDSSIYLCSA 

GRQGTYEQYFGPGTRLTVTEDLKNVFPPEVA 

VFEPSEAEISHTQKATLVCLATGFYPDHVELS 

WWVNGKEVHSGVSTDPQPLKEQPALNDSRY 

CLSSRLRVSATFWQNPRNHFRCQVQFYGLSE 

NDEWTQDRAKPVTQIVSAEAWGRADCGF1S 

ESYQQGVLSATILYEILLGKATLYAVLVSALV 

LMAMVKRKDSRG 


943 


2293 


A 


8070 


1 



879 


MVKVVPATRGNLPRSQLTGTHQHCQPREPKI 

TASERLRRRPRATARLRAJIAAPPEPPLAVFAP 

PSDRKELLALPVACDPVIASVMSWVQAASLI

QGPGDKGDVFDEEADESLLAQREWQSNMQR 

RVKEGYRDGIDAGKAVTLQQGFNQGYKKGA 

EVDLNYGRLRGTLSAXLSWCHLHNNNSTLINK 

INNLLDAVGQCEEYVLKHLKSITPPSHVVDLL 

DSIEDMDLCHVVPAEKKIDEAKDERLCENNA 

EFNKNCSKSHSGIDCSYVECCRTQEHAHSGK 

PKPHMDFGTDSQF 


944 


2294 


A 


8073 


1 


797 


ESARWSRQLRRTLIRLSFPISCGRSHAFGGCK 

MAATSGIBEPVSGELVSVAHALSLPAESYGN 

DPDIEMAWAMRAMQHAEVYYKLISSVDPQF 

LKLTKVDDQIYSEFRKNFETLRIDVLDPEELK 

SESAKEKWRPFCLKFNGIVEDFNYGTLLRLD 

CSQGYTEENTIFAPR1QFFAIEIARNREGYNKA 

VYISVQDKJEGEKGVNNGGEKRADSGEEENT 

KNGGEKGADSGEEKEEG1NREDKTDKGGEK 

GKEADKEINKSGEKAM 


945 


2295 


A 


8074 


2 


505 


G AATLLRS AS S AARKAAE AEQ VWLHLHRYL 

SADRRVLGLREWGRPASERECSLCQRLKREL 

NMGDVEKGKKIFIMKCSQCHTVEKGGKHKT 

GPNLHGLFGRKTGQAFGYSYTAANKNKGIIW 

GEDTLMEYLENPKKYIPGTKMIF^GIKKBCEER 

ADLIAYLKKATNE 


946 


2296 


A 


8081 


42 


590 


EGRRGKFGGKLCNFLFYFHSNSAESRMDVLF 
VAIFAVPLILGQEYEDEERLGEDEYYQWYY 
YTVTPSYDDFSADFTIDYSIFESEDRLNRLDK 
DITEAIETT] SLETARADHPKPVTVKPVTTEPQ 
SPRSEAMPCPVLRSPIPLPPVRVPLFRWGCISC 
KKVGRRLUVTTLWMGVWQEEIGR 


947 


2297 


A 


8084 


322 


549 


GGGSSPRELAGAAGLTVTSQAVAARRQQPSF 
SRARAPAHSLRAALSLASSARSWGAVSRDRG 
PCREAIMYQSSNKCH


948 


2298 


B 


8093 


3905 


846 


MEPGEVKERILENISLSVKKLQSYFAACEDEI 

PAIRNHDKVLQRLCEHLDHAIXYGLQDLSSG 

YWVLVVHFTRREAIKQIEVLQHVATKLGRSR 

AWLYLALNENSLESYLRLFQENLGLLHKYYV 

KNALVCSHDHLTLFLTLVSGLEFERFELDLDA 

PYIDLAPYMPDYYKPQYLLDFEDRLPSSVHG 

SDSLSLNSFNSVTSTNLEWDDSAIAPSSEDYD 

FGDVFPAVPSVPSTDWEDGDLTDTVSGPRST 

ASDLTSSKASTRSPTQRQNPFNEEPAETVSSS 

DTTPVHTTSQEKEEAQALDPPDACTELEVIRV 

TKKKKIGKKKKSRSDEEASPLHPACSQKKCA 

KQGDGDSKNGSPSLGRDSPDTMLASPQEEGE 

GPSSTTESSERSEPGLLIPEMKDTSMERLGQPL 

SKVIDQLNGQLDPSTWCSRAEPPDQSFRTGSP 

GDAPERPPLCDFSEGLSAPMDFYRFTVESPST 

VTSGGGHHDPAGLGQPLHVPSSPEAAGQEEE 

GGGGEGQTPRPLEDTTREAQELEAQLS1 ,VRE 

GP VSEPEPGTQEVLCQLKRDQPSPCLS SAEDS 

GVDEGQGSPSEMVHSSEFRVDNNHLLLLMIH 

VFRENEEQLFKMIRMSTGHMEGNLQLLYVLL 

TDCYVYLLRKGATEKPYLVEEAVSYNELDY 

VSVGLDQQTVKLVCTNRRKQFLLDTADVAL 

AEFFLASLKSAMIKGCREPPYPSILTDATMEK 

LALAKFVAQESKCEASAVTVRFYGLVHWED 

PTDESLGPTPCHCSPPEGTITKEGMLHYKAGT 

SYLGKEHWKTCFWLSNGILYQYPDRTDVP 

LLSVNMGGEQCGGCRRANTTDRPHAFQVILS 

DPPCXELSAESEAEMAEWMQHLCQAVSKGVI 

PQGVAPSPC1PCCLVLTDDRLFTCHEDCQTSF 

FRSLGTAKLGDISAVSIEPGKEYCVLEFSQDS 

QQLLPPWVCYLSCTSELDRLLSALNSGWKTIY 

QVDLPHTAIQEASNKKKFEDALSLIHSAWQR 

SDSLCRGRASRDPWC* 


949 


2299 


A 


8095 


9 


2374 


ARRADTVLLESPSMLQGLLPVSLLLSVAVSAI 

KELPGVKKYEWYPIRLHPLHKREAKEPEQQ 

EQFETELKYKJVmNGKlAVLYlKK>JKNLLAP 

GYTETYYNSTGKEITTSPQIMDDCYYQGHILN 

EKVSDASISTCRGLRGYFSQGDQRYFffiPLSPI 

HRDGQEHALFKYNPDEKNYDSTCGMDGVL 

WAHDLQQNIALPATKLVKLKDRKVQEHEKY 

DEYYLVLDNGEFKRYNENQDEIRKRVFEMAN 

YVNMLYKKLNTHVALVGMEIWTDKDKDCIT 

PNASFTLENFSKWRGSVLSRRKRHDIAQLITA 

TELAGTTVGLAFMSTMCSPYSVGWQDHSD 

NLLRVAGTMAHEMGHNFGMFHDDYSCKCPS 

TICVMDKALSFYIPTDFSSCSRLSYDKFFEDKL 

SNCLFNAPLPTDIISTP1CGNQLVEMGEDCDC 

GTSEECTNICCDAKTCKJKATFQCALGECCEK 

LQtlUUVuMVCRPAKDECDLPEMCNGKSGNC 

PDDRFQVNGFPCHHGKGHCLMGTCPTLQEQ 

CTELWGPGTEVADKSCYNRNEGGSKYGYCR 

RVDDTLIPCKANDTMCGKLFCQGGSDNLPW 

KGRIVTFLTCKTFDPEDTSQEIGMVANGTKCG 

DNK.VCINAECVDIEKAYKSTNCSSKCKGHAV 

CDHELQCQCEEGWIPPDCDDSSWFHFSrWG 

VLFPN4A\TFVVVAMVIRHQSSREKQKKDQRP 

LSTTGTRPHKQKRKKJMVKAVQPQEMSQMK 

PHVYDLPVEGNEPPASFHKDTNALPPTVFKD 

NPMSTPKDSNPKA 
950


2300 


A 


8100 


1 


1251 


MGLXLMIIASAVLGSFLTLLAQTFLLYRRQPE 

PPADEAARAGEGFRYIKPVPGLLLREYLYGG 

GRDEEPSGAAPEGGATP1AAPETPAPPTRETC 

YFLNATILFLFRELRDTALTRRWVTKKIKVEF 

EELLQTKTAGRLLEGLSLRDVFLGETVPFIKTI 

RLVRPWPSATGEPDGPEGEALPAACPEELAF 

EAEVEYNGGFHLAIDVDLVFGKSAYLFVKLS 

RWGRLRLVFTRVPFTHWFFSFVEDPLlDt'EV 

RSQFEGRPMPQLTSnWQLKKIIKRKHTLPNY 

KERFKPFFPYQTLQGFEEDEEHMQQWALTE 

GRLKVTLLECSRLLIFGSYDREANVHCTLELS 

SSVWEEKQRSSKTGTISLTAVFMGWHRVSE 

AFPGT.WYKlJ:.VniJ^'Gl,ErX}GPIJ.TVPLRQ 

CPG 


951 


2301 


A 


810$ 


1612 


839 


EVALFCFEMAAGMYLEHYLDSIENLPFELQR 

NFQLMRDLDQRTEDLKAEIDKIATEYMSSAR 

SLSSEEKLALLKQIQEAYGKCKEFGDDKVQL 

AMQTYEMVDKHIRRLDTDLARFEADLKEKQT 

ESSDYDSSSSKGKKKGRTQKEKKAARARSKG 

KNSDEEAPKXAQKKLKLVRTSPEYGMPSVTF 

GSVHPSDYLDMPVDPNEPTYCLCHQVSYGE 

MIGCDNPDCSIEWFHFACVGLTTKPRGKWFC 

PRCSQERKKK 


952 


2302 


A 


8112 


595 


291 


PSVASLARRFSGRALWPPSHSVPGNRALCPRL 
LHGTTLPGGNQRELARQKNMKKQSDSVKGK 
RRDDGLS AAARKQRD STPRDSELMQQKQKK 
ANEKKEEPK 


953 


2303 


A 


8118 


1 


669 


VCAGIRDPCSTPLAKPAAGGAENLSFGKQPG 

LETNILKMTTPNKTPPGADPKQLERTGTVREI 

GSQAVWSLSSCKPGFGVDQLRDDNLETYWQ 

SDGSQPHLVNIQFRRKTTVKTLCIYADYKSDE 

SYTPSKISVRVGNNEHNLQEIRQLELVEPSGW 

IHVPLTDNHKKPTRTFM1QIA VL ANHQN GRD 

THMRQIKJYTPVEESSIGKJPRCITIDFMMYRS 

IR 


954 


2304 


A 


8133 


66 


1015 


PPLPPRSFPNLFSRPEPLPEPGRRGCNRSREPA 

ARAPSPPPPFEGAPGRAMVKVTFNSALAQKE 

AKKDEPKSGEEAJUIPPDAVAVDCKDPDDVV 

PVGQRRAWCWCMCFGLAFMLAGVILGGAY 

LYKYFALQPDDVYYCGIKYIKDDVILNEPSAD 

APAALYQTIEENIKIFEEEEVEFISVPVPEFADS 

DPANIVHDFNKJ^TAYLDL>aJDKCYVIPLNT 

SIVMPPRNLLELLINIKAGTYLPQSYLIHEHMV 

rroRIENTOHIXjFFrvnRLCHDKETYKLQRIim 

KGIQKREASNCFAIRHFENKFAVETLICS 


955 


2305 


A 


8143 


35 


1171 


VESRSAWHEGEDQIDRLDFERNQMNLLTLDV 

KKK1KEVTEEVANKVSCAMTDEICRLSVLVD 

EFCSEFHPNPDVLKIYKSELNKHIEDGMGRNL 

ADRCTDEVNALVLQTQQEHENLKPLLPAGIQ 

DKUfrUPCIGO^LSYNLNYHKLCSDFQEDIV 

FRFSLGWSSLVHRFLGPRNAQRVLLGLSEPIF 

QLPRSLASTPTAPTTPATPDNASQEELMITLVT 

GLASVTSRTSMGinVGGVTWKTIGWKLLSV& 

LTMYGALYLYERLSWTTHAKERAFKQQFVN 

YATEKLRMTVSSTSANCSHQVKQQIATTFARL 

CQQVDITQKQLEEEIARLPKEIDQLEKIQNNS 

KLLRNKAVQLENELENFTKQFLPSSNEES 


956 


2306 


A 


8157 


1854 


798 


"ASGSPAPSSSSAMAAACGPGAACjYCLLLGLH 
LFLLTAGPALGWNDPDRMLLRDVKALTLHY 
DRYI ISR^DPJPQLKCVOOTAQCTSYTPKVI >

QCQNKGWDGYDVQWECKTDLDIAYKFGKT 

WSCEGYESSEDQYVLRGSCGLEYNLDYTEL 

GLQKLKESGKQHGFASFSDYYYKWSSADSC 

NMSGLrriVVLLGIAFVVYKLFLSDGQYSPPP 

YSEYPPFSHRYQRFTNSAGPPPPGFKSEFTGPQ 

NTGHGATSGFGSAFTGQQGYENSGPGFWTGL 

GTGG1LG YLFGSNRAATPFSDS WYYPSYPPS Y 

PGTWNRAYSPLHGGSGSYSVCSNSDTKTRTA 

SGYGGTRRR 


957 




A 


8159 


1492 


528 


THVVMTOMCYAPHQVLSYINGVTTSKPGVSL 

VYSMPSRNLSLRLEGLQEKDSGPYSCSVNVQ 

DKQGKSRGHSIK'l LELN VL VPPAPPSCRLQGV 

PHVGANVTLSCQSPRSKPAVQYQWDRQLPSF 

QTFFAPALDVIRGSLSLTNLSS SMAGVYVCKA 

HNEVGTAQCNVTLEVSTGPGAAWAGAWG 

TLVGLGLLAGLVLLYHRRGKALEEPANDDCE 

DAIAPRHPWPKSSDTISKNGTLSSVTSARAL 

RPPHGPPRPGALTPTPSLSSQALPSPRLPTTDG 

AHPQPISPIPGGVSSSGLSRMGAVPVMVPAQS 

QAGSLV 


958 


2308 


A 


8161 


2340 


1192 


ELARRPKQQSSEKSRNMIRNWLT1FILFPLKLV 

EKCESSVSLTVPPWKLENGSSTNVSLTLRPP 

LNATLVITFEITFRSKNrriLELPDEVVVPPGVT 

NSSFQVTSQNVGQLTVYLHGNHSNQTGPRIR 

FLVIRSSAISIINQVIGW1YFVAWSISFYPQVIM 

NWRRKSVIGLSFDFVALNLTGFVAYSVFNIGL 

LWVPYIKEQFLLKYPNGVNPVNSNDVFFSLH 

AWLTLIUVQCCLYERGGQRVSWPAIGFLVL 

AWLFAFVTMIVAAVGV1TWLQFLFCFSYIKL 

AVTLVKYFPQAYMNFYYKSTEGWSIGNVLL 

DFTGGSFSLLQMFLQSYNNDQWTLIFGDPTK 

FGLGVFSIVFDWFFIQHFCLYRKRPGYDOLN 


959 


2309 


A 


8163 


521 


1345 


GERAGRRRGRLGVWAQPQPLLPRPVGSRRE 

MQPPGPPPAYAPTNGDFTFVSSADAEDLSGSI 

ASPDVKLNLGGDFIKESTATTFLRQRGYGWL 

LEVEDDDPEDNKPLLEELDIDLKDIYYKIRCV 

LMPMPSLGFNRQWRDNPDFWGPLAV\OJFS 

MISLYGQFRWSWimWIFGSLTIFLLARVLG 

GEVAYGQVLGVIGYSLLPUVIAPVLLWOSF 

EWSTUKLFGVFWAAYSAASLLVGEEFKTK 

KPLLIYPIKLLYIYFLSLYTGV 


960 


2310 


A 


8167 


1 


2921


MTCFKGQKGEQRSHAFEANKDHKAKVPSPN 
LYSQLNALQFTVDERSILWLNQFLLDLKQSL 
NQFMAVYKLNDNSKSDEHVDVRVDGLMLK 
FVIPSEVKSECHQDQPRAISIQSSEMIATNTRH 
CPNCRHSDLEALFQDFKDCDFFSKTYTSFPKS 
CDNFl^LHPIFQRHAHEQDTKMHEIYKGNITP 

QLNKNTLKTSAATDVWAVYFSQFWrDYEGM 
KSGKCRPrRFVD^PPT ^IWirfYPTPVArQnv-nn 

jvuvjivvjiu. Lor v uqi'l jjQi. Wl\AjJr I K. I AhMjis fir 

QTCNQVSLNTSQSESSDLAGRLKRKKLLKEY 

YSTESEPLTNGGQKPSSSDTFFRFSPSSSEADI 

HLLVHVHKHVSMQINHYQYLLLLFLHESLILL 

SENLRKDVEAVTGSPASQTSICIGILLRSAELA 

LLLHPVDQANTLKSPVSESVSPVVPDYLPTEN 

GDFLSSKRKQISRDINRIRSVTVNHMSDNRSM 

SVDLSHIPLKDPLLFKSASDTNLQKGISFMDY 

LSDKHLGKISEDESSGLVYKSGSGEIGSETSD 

KXDSFYTDSSSVLNYREDSNILSFDSDGNQNI 

LSSTLTSKGNETDBSIFKAEDLLPEAASLSENL 
DISKEETPPVRTLKSQSSLSGKPKERCPPNLAP

LCVSYKNMKRSSSQMSLDTISLDSMILEEQLL 

ESDGSDSHMFLEKGNKKNSTTNYRGTAESVN 

AGANLQ>TVGETSPDAISTNSEGAQENHDDLM 

SVVVFKTTGVNGEIDIRGEDTEICLQVNQVTP 

IX^LGNISIJIHYLCNRPVGSDQKAVIHSKSSPE 

ISLRFESGPGAV1HSLLAEKNGFLQCHIENFST 

EFLl^SLMNIQHFLEDETVATVMPMKIQVSNT 

KINLKDDSPRSSTVSLEPAPVTVHIDHLWER 

SDDGSFHIRDSHMLNTGNDLKENVKSDSVLL 

TSGKYDLKKQRSVTQATQTSPGVPWPSQSAN 

FPEPSFDFTREQLMEENESLKQELAKAKMAL 

AEAHLEKDALLHHIKKMTVE 


961 


2311 


A 


8172 


1442 


682 


TAAMSIFTPTNQD^TNVAVVRM^AOKRFEI 

ACYKNKWGWRSGVEKDLDEVLQTHSVFVN 

VSKGQYAKKEDLISAFGTDDQTEICKQILTKG 

EVQVSDKERHTQLEQMFRDIATIVADKCVNP 

ETKUPYTVILIFJIAMKDIHYSVKTNKSTKQQA 

LEVIKQLKEKMKIERAHMRLRFILPVNEGKKL 

KEKLKPLDXVIESEDYGQQLEIVCLIDPGCFREI 

DELIKKETKGKGSLEVLNLKDVEEGDEKFE 


962 


2312 


A 


8175 


286 


587 


NISNKAEVSSHPSV1SHSMDSFGQPRPEDNQS 
VLRRMQKKYWKTKQVFIKATGKKEDEHLVA 
SDAELDAKLEVFHSVQETCTELLKHEKYQLR 
LNGMKS 


963 


2313 


A 


8181 


13 


2215 


AEGCAERRGTEPWELSMSWESGAGPGLGSQ 

GMDLVWSAWYGKCVKGKGSLPLSAHGIW 

AWLSRAEWDQVTVYLFCDDHKLQRYALNRI 

TVWRSRSGNELPLAVASTADLIRCKLLDVTG 

GLGTDELRLLYGMALVRFVNLISERKTKFAK 

WLKQ^QEVNIPDWIVDLRHELTHKKMPHI 

NDCRRG CYFVLD WLQKTY WCRQLENSLREr 

WELEEFREGIEEEDQEEDKNIWDDITEQKPE 

PQDDGKSTESDVKADGDSKGSEEVDSHCKK 

ALSHKELYERARELLVSYEEEQFTVLEKFRYL 

PKAIKAWNNP SPRVECVL AELKG VTCENREA 

VLDAFLDDGFLVPTFEQLAALQIEYEENVDL 

NDVLVPKPFSQFWQPLLRGLHSQNFTQALLE 

RMLSELPALGISGIRPTY1LRWTVELIVANTKT 

GRNARRFSAGQWEARRGWRLFNCSASLDWP 

RMVESCLGSPCWASPQLLRUFKAMGQGLPD 

EEQEKLLRICSIYTQSGENSLVQEGSEASPIGK 

SPYTLDSLYWSVKPASSSFGSEAKAQQQEEQ 

GSVNDVKEEEKEEKEVLPDQVEEEEENDDQE 

EEEEDEDDEDDEEEDRMEVGPFSTGQESPTA 

ENARLLAQKRGALQGSAWQVSSEDVRWDTF 

PLGRMPGQTEDPAELMLENYDTMYLLDQPV 

LEQRLEPSTCKTDTLGLSCGVGSGNCSNSSSS 

NFEGLLWSQGQLHGLKTGLQLF 


964 


2314 


A 


8184 


6 


1393 


EPRRNFRDDSTRPRTRGRTRGRRRRACRSAE 

GTGLRSLL1/PPRLQLPAGPFSRCRWDPVS SPR 

PSTMPPKKGGDG1KPPPDGRFGTSLKIGIVGLP 

NVGKSTFFNVLTNSQASAENFPFCTIDPNESR 

VPVPDERFDFLCQYHKPASKIPAFLNVVDIAG 

LVKGAHNGQGLGNAFLSHISACDGIFHLTRA 

FEDDDITHVEGSVDPIRDIEIIHEELQLKDEEMI 

GPIIDKLEKVAVRGGDKKLKPEYDIMCKVKS 

WVTDQKKP VRF YHD WNDB1EIEVLNKHLFLTS 

KPMVYLVNXSEIO^YIRKKNKWLIKIKEWVD 

KYDPGALVEPFSGALELKLQELSAEERQKYLE 
ANMTOSAIJ'KinCAOFAALQr^XBPXA QPDEY
RAWTI^GTKAPQAAGmTDFEKOFIMAEV 
MKYEDFKEEGSENAVKAAGKYRQQGRNYIV 
EDGDHFFKFNTPQQPKKK 


965 


2315 


A 


8195 


1437 


594 


RSFSLSFSLLSPSEMMALGAAGATRVFVAMV 

AAALGGHPLLGVSATLNSVLNSNA1KNLPPPL 

GGAAGHPGSAVSAAPGILYPGGNKYQT1DNY 

QPYPCAEDEECGTDEYCASPTRGGDAGVQ1C 

LACRKRRKRCMRHAMCCPGNYCKNGICVSS 

DQNHFRGEIEETrrESFGNDHSTLDGYSRRTT 

LSSKMYHTKGQEGSVaRSSDCASGLCCARH 

FWSK3(XPVIJKEGQVCTKHRRK,GSHGLEIFQ 

RCYCGEGLSCRIQKDHHQASNSSRJLHTCQRH 


966


2316 


A 


8207 


416 


4082 


KFKLIKIMLLTL11LLPWSKFSFVSLSAPQHW 

SCPEGTLAGNGNSTCV GPAPFLIFSHGNSIFRI 

DTEGTNYEQLWDAGVSVIMDFHYNEKRIY 

WVDLERQLLQRVFiNGSRQERVCNIEKNVSG 

MAIN W1NEEV1 WSNgQEGllTVTDMKGNNSHI 

LLSALKYPANVAVDPVERFIFWSSEVAGSLY 

RADLDGVGVKALLETSEKTTAVSLDVLDKRL 

FWIQYNREGSNSLICSCDYDGGSVHISKHPTQ 

HNLFAMSLFGDRIFYSTWKMKTIWIANKHTG 

KDMVRINLHSSFVPLGELKWHPLAQPKAED 

DTWEPEQKLCKLRKGNCSSTVCGQDLQSHLC 

MCAEGYALSRJDRKYCEGNDWKYCEDVNEC 

AFWNHGCTLGCKNTPGSYYCTCPVGFVLLPD 

GKRCHQLVSCPRNVSECSHDCVLTSEGPLCF 

CPEGSVLERDGKTCSGCS SPDNGGCSQLCVPL 

SPVSWECDCFPGYDLQLDEKSCAASGPQPFL 

LFANSQDIRHMHFDGTDYGTLLSQQMGMVY 

ALDHDP\^ENKIYFAHTALKWIERANMDGSQ 

RERLIEEGVDVPEGLAVDWIGRRFYWTDRGK 

SLIGRSDLNGKRSKI1TIEN1SQPRGIAVHPMAK 

RLFWTDTGINPRIESSSLQGLGRLVIASSDLIW 

PSGITIDFLTDKLYWCDAKQSVIEMANLDGSK 

RRRLTQNDVGHPFAVAVFEDYVWFSDWAMP 

SVIRVNKRTGKDRVRLQGSMLKPSSLWVHP 

LAKPGADPCLYQNGGCEHICKKRLGTAWCS 

CREGFMKASDGKTCLALDGHQLLAGGEVDL 

KNQVTPLDILSKTRVSEDNITESQHMLVAEIM 

VSDQDDCAPVGCSMYARCISEGEDATCQCLK 

GFAGDGKLCSDIDECEMGVPVCPPASSKCINT 

EGGYVCRCSEGYQGDGIHCLDIDECQLGVHS 

CGENASCTNTEGGYTCMCAGRLSEPGLICPD 

STPPPHLRJEDDHHYSVRNSDSECPLSHDGYa 

HDGVCMYIEALDKYACNCWGYIGERCQYR 

DLKWWELRHAGHGQQQKVIWAVCVVVLV 

MaLLSLWGAHYYRTQKLLSKNPKNPYEESS 

RDVRSRRPADTEDGMSSCPQPWFWIKEHQD 

LKNGGOPVAGEDOO A ADftRMnPTS WP oppa 

LCGMGTEQGCWIPVSSDKGSCPQVMERSFH 

MPSYGTQTLEGGVEKPHSLLSANPLWQQRAL 
DPPHQMELTQ 


967 


2317 


A


8210 


3 



601 


SSAMGSRSSHAAVIPDGDSIRRETGFSOASLL 
RLHHRFRALDRNKKGYLSRMDLQQIGALAV 
NPLGDRIIESFFPDGSQRVDFPGFVRVLAHFRP 
VEDED1ETQDPKKPEPLNSRRNKLHYAFQLY 
DLDRDGKJSRHEMLQVLRLMVGVQVTEEQL 
ENIADRTVQEADEDGDGAVSFVEFTKSLEKM 
DVEHKMSIRILK


968 


2318 


A 


8211 


2 


409 


ISSCPHTAYEGSMSTLSNFTQTLEDVFRRIFIT

YMDNWRQNTTAEQEALQAKVDAENFYYVTL 

YLMV>flGMFSFIIVAILVSTVKSKRREHSNDP 

YHQYTVEDWQEKYKSQILNLEESKATIHEN1G 

AAGFKMSP 


969 


2319 


A 


8215 


1 


1938 


GMPRSRGGRAAPGPPPPPPPPGQAPRWSRWR 

VPGRLLLLLLPALCCLPGAARAAAAAAGAGN 

RAAVAVAVARADEAEAPFAGQNWLKSYGY 

LLPYDSRASALHSAKALQSAVSTMQQFYGIP 

VTGVLDQTTEWMKKPRCGVPDHPHLSRRRR 

NKRYALTGQKWRQKHlTreiHNYTPKVGELD 

TRKAIRQAFDVWQKVTPLTFEEVPYHEIKSDR 

KEADIMIFFASGFHGDSSPFDGEGGFLAHAYF 

PGPG1GGDTHFDSDEPWTLGNANHDGNDLFL 

VAVHELGHALGLEHSSDPSAIMAPFYQYMET 

HNFKLPQDDLQGIQKIYGPPAEPLEPTRPLPTL 

PVRRfflSPSERKHERQPRPPRPPLGDRPSTPGT 

KPNTCDGNFNTVAT .FRGEMFVFKDRWFWRL 

RNNRVQEGYPMQIEQFWKGLPARIDAAYER 

ADGRFVFFKGDKYWVFKEVTVEPGYPHSLG 

ELGSCLPREGtDTALRWEPVGKTYFFKGERY 

WRYSEERRATDPGYPKPITVWKG1PQAPQGA 

F1SKEGYYTYFYKGRDYWKFDNQKLSVEPGY 

PRNILRDWTMGCNQKEVERRKERRLPQDDVDI 

MVTINDWGSVNAVAVV1PCILSLCILVLVYTI 

FQFKNKTGPQPVTYYKRPVQEWV 


970 


2320 


A 


8216 


1235 


2223 


SRLSLQFYVSFRRTGLFTCKLIVEIFFRNYMN 

DSLRTNVFVRFQPETIACACIYLAARALQIPLP 

TRPHWFLLFGTTEEEIQEICIETLRLYTRKKPN 

YELLEKEVEKRKVALQEAKLKAKGLNPDGTP 

ALSTLGGFSPASKPSSPREVKAEEKSPISINVK 

TVKKEPEDRQQASKSPYNGVRKDSKRSRNSR 

SASRSRSRTRSRSRSHTPRRHYNNRRSRSGTY 

SSRSRSRSRSIISESPRRHHNHGSPHLKAKHTR

DDLKSSNRHGHKRKKSRSRSQSKSRDHSDAA 

KKHRHERGHHRDRRERSRSFERSHKSKHHGG 

SRSGHGRHRR 


971 


2321 


A 


8217 


3 


3274 



DCRLQAAMPTNFTVVPVEAHADGGGDETAE 

RTEAPGTPEGPEPERPSPGDGNPRENSPFLNN 

VEVEQESFFEGKNMALFEEEMDSNPMVSSLL 

NKLANYTNLSQGVVEHEEDEESRRREAKAPR 

MGTFIGVYLPCLQNILGVILFLRLTWIVGVAG 

VLESFLIVAMCCTCTMLTAISMSAIATKGVVP 

AGGSYYMISRSLGPEFGGAVGLCFYLGTTFA 

GAMYILGT1EIFLTYISPG AAEFQ AEAAG GEAA 

AMLHNMRVYGTCTLVLMALVVFVGVKYVN 

KLALVFLACVVLSILAJYAGVIKSAFDPPDIPV. 

CLLGNRTLSRRSFDACVKAYGIHNNSATSAL 

WGLFCNGSQPSAACDEYFIQNNVTEIQGIPGA 

ASGVFLENLWSTYAHAGAFVEKKGVPSVPV 

AEESRASTLPYVLTDIAASFTLLVGIYFPSVTG 

IMAGSNRSGDLKDAQKSIPTGTILAJVTTSFIY 

LSCIVLFGACIEGVVLRDKFGEALQGNJtVIGM 

LAWPSPWVIVIGSFFSTCGAGLQTLTGAPRLL 

QAIARDGIVPFLQVFGHGKANGEPTWALLLT 

VLICETGILIASLDSVAPILSMFFLMCYLFVNL 

ACAVQTLLRTPNWRPRFKFYHWTLSFLGMSL 

CLALMFICSWYYALSAMLIAGCIYKYIEYRG 

AEKEWGDGIRGLSLNAARYALLRVEHGPPHT 

KKWRPQVLVMLNLDAEQAMKHPRLLSFTSQ

LKAGKGLTIVGSVLEGTYLDKHMEAQRAEE

in iKoLMo i his. i fjjr LCjL VVSSSLRIXjMSHLIQ 

SAGLGGUCHNTVLMAWPASWKQEDNPFSW 

KNFVDTVRDTTAAHQALLVAKNVDSFPQNQ 

ERFGGGH1DVWWIVHDGGMLMLLPFLLRQH 

KVWRKCRMRIFTVAQVDDNSIQMJCKDLQNIF 

L i rtLKl a At, Vfc V VEMVENDIS AFTYERTLMM 

EQRSQMLKQMQLSKNEQEREAQL1HDRNTAS 

HTAAAARTQAPPTPDKVQMTWTREKLIAEK 

YRSRDTSLSGFKDLFSMKPDQSNVRRMHTAV 

KLNGWLNKSQDAQLVLLNMPGPPKNRQGD 

ENYMEFLEVLTEGLNRVLLVRGGGREVITIYS 


972 


2322 


A 


8224 


701 


246 


TSRRVTMKFNPFVl-SDRSKNRKRHb'NAPSHV 

RRKIMSSPLSKELRQKYNVRSMPIRKDDEVQ 

WRGHYKGQQIGKWQVYRKKYVIYIERVQ 

REKANGTTVHVGIHPSKVVITRLKLDKDRKKI 

LERKAKSRQVGKEKGKYKEELIEKMQE 


973 


2323 


A 


8237 


873 


4610 


GCPHAGGKGRVPTGGLTGGRTWSPSAAPRSC 

PRPGPTPAPGAMDKLPPSMRKRLYSLPQQVG 

AKAWIMDEEEDAEEEGAGGRQDPSRRSIRLR 

PLPSPSPSAAAGGTESRSSALGAADSEGPARG 

AGKSSINGDCRRFRGSLASLGSRGGGSGGTG 

SGSSHGHLHDSAEERRL1AEGDASPGEDRTPP 

GLAAEPERPGASAQPAASPPPPQQPPQPASAS 

CEQPSVDTAIKVEGGAAAGDQDLPEAEVRLG 

QAGFMQRQFGAMLQPGVNKPSLRMFGSQKA 

VEREQERVKSAGFWUHPYSDFRFYWDLTML 

LLr^GNLlIffVGITrTKDErTrrPWTV^WSD 

TrTLIDLVLNFRTGIVVEDNrEnLDPQRrXMK 

YLKSWFM\DFISSIPVDYIFLIVETRIDSEVYK 

TARALRIVRFTKILSLLRLLRLSRLIRYIHQWE 

EIFHNlTYDLASAVVRrVNLIGMMLLLCHWDG 

CLQFLVPMLQDFPDDCWVSINNMVNNSWGK 

QYSYALFKAMSIIMLCIGYGRQAPVGMSDV 

WLTMLSMIVGATCYAMFIGHATALIQSLDSS

RRQYQEKYKQVEQYMSFHKLPPDTRQRIHD 

YYEHRYQGKMFDEESILGELSEPLREEIINFNC 

RKLVASMPLFANADPNFVTSMLTKLRFEVFQ 

PGDYIIREGTtGKKMYFIQHGWSVLTKGNKE 

TKLADGSYFGEICLLTRGRRTASVRADTYCR 

LYSLSVDNFNEVLEEYPMMRRAFETVALDRL 

DRIGKKNSILLHKVQHDLNSGVFNYQENEIIQ 

QIVQHDREMAHCAHRVQAAASATPTPTPVIW 

TPLIQAPLQAAAATTSVAIALTHHPRLPAAIFR 

PPPGSGLGNXGAGQTPRHLKRLQSLIPSALGS 

ASPASSPSQVDTPSSSSFHIQQLAGFSAPAGLS 

rLL^bbbaSPrPGACuSPSAIoPSAGVAATTlA 

GFGHFHKALGGSLSSSDSPLLTPLQPGARSPQ 

AAQPSPAPPGARGGLGLPEHFLPPPPSSRSPSS 

SPGQLGQPPGELSLGLATGPLSTPETPPRQPEP 

PSLVAGASGGASPVGFTPRGGLSPPGHSPGPP 

RTFPSAPPRASGSHGSLLLPPASSPPPPQVPQR 

RGTPPLTPGRLTQDLKUSASQPALPQDGAQT 

LRRASPHSSGESMAAFPLFPRAGGGSGGSGSS

GGLGPPGRPYGAIPGQHVTLPRKTSSGSLPPP 

LSLFGARATSSGGPPLTAGPQREPGARPEPVR 

SKLPSNL 


974 


2324 


A 


8247 


279 


468 


EYKQWERRFLSCQNRNDLGYGKPRKGGGLL 
LVPVKDASRICSLTYLLGSHWNNLWRSPVL 
G


975 


2325 


A 


8249 


62 


1571 


LVALKNWKPKGTN1PAPQSPVFGEAVSGVYM 

MTKVLGMAPVLGPRPPQEQVGPLMVKVEEK 

EEKGKYLPSLEMFRQRFRQFGYHDTPGPREA 

LSQLRVLCCEWLRPEIHTKEQILELLVLEQFLT 

DLPQELQAWVQEHCPESAEEAVTLLEDLEREL 

DEPGHQVSTPPNEQKPVWKISSSGTAKESPS 

SMQPQPLETSHKYESWGPLYIQESGEEQEFAQ 

DPRKVRDCRLSTQHEESADEQKGSEAEGLKG 

DHSVI1ANKPEASLERQCVNLENEKGTKPPLQ 

EAGSKKGRESVPTKPTPGERRY1CAECGKAFS 

NSSNLTKHRRTHTGEKPYVCTKCGKAFSHSS 

NLTLHYRTHLVDRPYDCKCGKAFGQSSDLLK 

HQRMHTEEAPYQCKDCGKAFSGKGSURHYR 

IHTGEKPYQCNECGKSFSQHAGLSSHQRLHT 1 

GEKPYKCKECGKAFNHSSNFNKHHRIHTGEK

PYWCHHCGKTFCSKSNLSKHQRVHTGEGEA 

P 


976 


2326 


A 


8257 


298 


7086 


GNMACWPQLRLLLWKNLTFRRRQTCQLLLE 

VAWPLFIFLILISVRLSYPPYEQHECHFFNKAM 

PSAGTLPWVQG1ICNANNPCFRYPTPGEAPGV 

VGNFNKS1VARLFSDARRLLLYSQKDTSMKD 

MRKVLRTLQQ1KKSSSNLKLQDFLVDNETFS 

GFLYHNLSLPKSTVDKMLRADVILHKVFLQG 

YQLHLTSLCNGSKSEEMIQLGDQEVSELCGLP 

REKLAAAERVLRSNMDILKPILRTLNSTSPFPS 

KELAEATKTLLHSLGTLAQELFSMRSWSDMR 

QEVMFLTNVNSSSSSTQIYQAVSRIVCGHPEG 

GGLKIKSLNWYEDNNYKALFGGNGTEEDAE 

TFYDNSTTPYCNDLMKNLESSPLSRIIWKALK 

PLLVGKILYTPDTPATRQNTvlAEVNKTFQELA 

VFHDLEGMWEELSPKnVTFMENSQEMDLVR 

MLLDSRDNDHFWEQQLDGLDWTAQDIVAFL 

AKHPEDVQSSNGSVYTWREAFNETNQAIRTIS 

RFMECVNLNKLEPIATEVWLINKSMELLDER 

KFWAGATTGITPGSIELPHHVKYKIRMGIDN 

VERTNKIKDGYWDPGPRADPFEDMRYVWGG 

FAYLQDWEQAIIRVLTGTEKKTGVYMQQMP 

YPCYVDDIFLRVMSRSMPLFMTLAWIYSVAV 

OKGIVYEKEARLKETMRIMGLDNSILWFSWFI 

SSLIPLLVSAGLLWILKLGNIXPYSDPSWFV 

FLSVFAVVTILQCFLISTLFSRANLAAACGGII 

YFTLYLPYVLCVAWQDYVGFTLIOFASLLSP 

VAFGFGCEYFALFEEQGIGVQWDNLFESPVE 

EIXjFNLTTSVSMMLFDTFLYGVMTWYIEAVF 

PGQYGPRPWYFPCTKSYWFGEESDEKSHPGS 

NQKRISEICMEEEPTHLKLGVSIQNLVKVYRD 

GMKVAVDGLALNFYEGQITSFLGHNGAGKT 

TTMSILTGLFPPTSGTAYILGKDIRSEMSTIRQ 

NLGVCPQHNVLFDMLTVEEHIWFYARLKGLS 

EKHVKAEMEQMALDVGLPSSKLKSKTSQLS 

GGMQRKLSVALAFVGGSKWILDEPTAGVDP

YSRRGlWELLLKYRQGRmSTHHMDEADVL 

GDRIAIISHGKLCCVGSSLFUCNQLGTGYYLT 

LVKKDVESSLSSCRNSSSTTVSYLKKEDSVSQS 

SSDAGLGSDHESDTLTIDVSA1SNLIRKHVSEA 

RLVEDIGHELTYVLPYEAAKEGAFVELFHEID 

DRLSDLGISSYGISETTLEEIFLKVAEESGVDA 

ETSDGTLPARRNRRAFGDKQSCLRPFTEDDA 

ADPNDSDIDPESRETDLLSGMDGKGSYQVKG 

WKLTQQQFVALLWKR1 Jf JARRSRKGFFAQIV

IJAWVCIALVFSLTVPPFGKYPST.PJ.nPWMV 

NEQYTFVSNDAPEDTGTLELLNALTKDPGFG 

TRCMEONPIPDTPCQAGEEEWTTAPVPQTIM 

DU?QNGNU r TMQNPSPACQCSSDKIKKMLPV 

CPPGAGGLPPPQRKQNTADHQDLTGRNISDY 

LVKTYVQIlAKSLrvNKIWVNEFRYGGFSLGVS 

NTQALPPSQEV>TOATKQN1KKHLKLAKDSSA 

DRFLNSUjRF'MTGLDTHKNVKVWFNNKGW 

HAISSFLNVINNA1LRANLQKGENPSHYGITAF 

NHPLNLTKQQLSEVAPMTTSVDVLVSICVIFA 

MSFVPASFVVFLIQERVSKAKHLQFISGVKPVI 

YWLSNFVWDMCNYWPATLVniFICFQQKSY 

VSSTNLPVLALLLLLYGWSITPLMYPASFVFK 

IPSTAYVVLTSVNLFIGINGSVATFVLELFTDN 

KLNNINDILKSVFLIFPHFCLGRGLIDMVKNQ 

AMADALERFGENRFVSPLSWDLVGRNLFAM 

AVEGWFFLITVLIQYRFFIRPRPVNAKLSPLN 

DEDEDVRRERQRILDGGGQNDILEIKBLTKIY 

RRKRJKPAVDRICVGIPPGECFGLLGVNGAGK 

SSTFKMLTGDTTVTRGDAFLNRNSILSN1HEV 

HQNMGYCPQFDAITELLTGRJEHVEFFALLRG 

VPEKEVGKVGEWAIRKLGLVKYGEKYAGNY 

SGGNKRKLSTAMAUGGPPVVFLDEPTTGMD 

PKARRFLWNCALSWKEGRSWLTSHSMEEC 

EALCTRMAIMVNGRFRCLGSVQHLKNRFGD 

GYTIWRIAGSNPDLKPVQDFFGLAFPGSVPK 

EKHR>JMLQYQLPSSLS SL.ARIFSILSQSKKRLH 

IEDYSVSQTTLDQVFVNFAKDQSDDDHLKDL 

SLIIKNQTVVDVAVLTSFLQDEKVKESYV


977 


2327 


A 


8260 


3 


1567 


IPGSTISFSLCFIFPPCVPTMVRKPVVSTISKGG 

YLQGNVNGRLPSLGNKEPPGQEKVQLKRKV 

TLLRGVSUIGTIIGAGIFISPKGVLQNTGSVGM 

SLTIWTVCGVLSLFGALSYAELGTTKKSGGH 

YTYELEVFGPLPAFVRVWVELLIIRPAATAVIS 

LAFGRYILEPFFIQCEIPELAIKLrrAVGITWM 

VLNSMSVSWSARIQ1FLTFCKLTA1LIIIVPGV 

MQLnCGQTQNFKDAFSGRDSSlTRLPLAFYYG 

MYAYAGWFYLNFVTEEVENPEKTIPLAICISM 

AIVTIGYVLTNVAYFTTINAEELLLSNAVAVT 

FSERLLGNFSLAVPIFVALSCFGSMNGGVFAV 

SRLFYVASREGHLPErLSMlHVRKHTPLPAVIV 

LHPLTMIMLFSGDLDSLLNFLSFARWLFIGLA 

VAGLIYLRYKCPDMHRFFKVPLFIPALFSFTC 

LFMVALSLYSDPFSTGIGFVITLTGVPAYYLFII 

WDKKPRWFRIMSEKITRTLQIILEVVPEEDKL 


978 


2328 


A 


8261 


2 


2165 


RGGSLRCVLGKLLGQLLCFQSERCVRFPEGLL" 
RHRGCGLLSSRLSAGKPPLRTSFFGSWGVLPP 
LADAASMSGVRAVRIS1ESACEKQVHEVGLD 
GTETYLPPLSMSQNLARLAQRIDFSQGSGSEE 

PEA AfJTE/VnArNJU/D/'S/l r>00 \T\r\T\r\nnrrKnTv 

cjiaavj i EKjUnKlE, W r UAuooAUyjJUEiiGV VK 

FQPSLWPWDSVRNNLRSALTEMCVLYDVLSI 

VRDKKFMTLDPVSQDALPPKQNPQTLQLISK 

KKSLAGAAQILLKGAERLTBCSVTENQENKLQ 

RDFNSELLRLRQHWKLRKVGDKELGDLSYRS 

AGSLFPHHGTFEVIKNTDLDLDKKIPEDYCPL 

DVQIPSDLEGSAYIKVSIQKQAPDIGDLGTVN 

LFKRPLPKSKPGSPHWQTKLEAAQNVLLCKEI 

FAQLSREAVQIKSQVPHIVVKNQnSQPFPSLQ 

LSISLCHSSNDKKSQKFATEKQCPEDHLYVLE 

HNLHLI..TREFHKQTLS SIMMPHP ASAPFGHKR

MRLSGPQAFDKNEINSLQSSEGLLEKIIKQAK 

HJFUISRAAATTDSLASRIEDPQIQAHWSNIND 

VYESSVKVLITSQGYEQICKSIQLQLNIGVEQI 

RWHRDGRVITLSYQEQELQDFIJLSQMSQHQ 

VHAVQQLAKVMGWQVLSFSNHVGLGPIESIG 

NASAITVASPSGDYAISVRNGPESGSKIMVQF 

PRNQCKDLPKSDVLQDNKWSHLRGPFKEVQ 

WNKMEGRNF VYKMELLMS AT .SPCLL 


979 


2329 


A 


8289 


2 


1053 


FVWNPRGGRKRRRQAAVTQAATRASGTPSP 

RDGTMTQGKLSVANKAPGTEGQQQVHGEKK 

EAPAVP S APPS YEEATS GEGMKAGAFPPAPTA 

VPLHPSV/AYVDPSSSSSYDNGFPTGDHELFTT 

FS WDDQBCVRRVF VRKVYTILLIQL1 /VTT AW 

ALFTFCDPVKDYVQANPGWYWASYAVFFAT 

YLTLACCSGPRRHFPWNLDXTVFTLSMAYLT 

GMLSSYYNTTSVLLCLGITALVCLSVTVFSFQ 

TKFDFTSCQGVLFVLLMTLFFSGLILAILLPFQ 

YVPWLHAVYAALGAGVFTLFLALDTQLLMG 

NRRHSLSPEEYIFGALNIYLDIIYIFTFFLQLFG 

TNRE 


980 


2330 


A 


8305 


59 


857 


ASQLPDYSISPPSLPPRISFHPSFTLARVAMAEP 

SEATQSHSISSSSFGAEPSAPGGGGSPGACPAL 

GTKSCSSSCAVHDLIFWRDVKKTOFVFGTTLI 

MLLSLAAFSVISVVSYLILALLSVTISFRIYKSV 

IQAVQKSEEGHPFKAYLDVDITLSSEAFHNY 

MNAAMVHINRALKLI1RLFLVEDLVDSLKLA 

VFMWLMTYVGAVFNGITLLILAELL1FSVP1V 

YEKYKTQIDHYVGIARDQTKSIVEKIQAKLPG 

IAKKKAE 


981 


2331 


A 


8308 


186 


1337 


TRMSRHEGVSCDACLKGNFRGRRYKCLICYD 

YDLCASCYESGATTTRHTTDHPMQCILTRVD 

FDLYYGGEAFSVEQPQSFTCPYCGKMGYTET 

SLQEHVTSEHAETSTEV1CPICAALPGGDPNH 

VTDDFAAHLTLEHRAPRDLDESSGVRIIVUR 

MFHPGRGLGGPRARRSNMHFTSSSTGGLSSS 

QSSYSPSNREAMDPIABLLSQLSGVRRSAGGQ 

LNSSGPSASQLQQLQMQLQLERQHAQAARQ 

QLETARNATRRTNTSSVTTTITQSTATTNIAN 

TESSQQTLQNSQFLLTRLNDPKMSETERQSM 

ESERADRSLFVQELLLSTLVREESSSSDEDDR 

GEMADFGAMGCVDIMPLDYALENLNLKESN 

KGNEPPPPPL 


982 


2332 


A 


8315 


1 


1004 


GSTHASADAWAQWFCTEALVMGAPVWYLV 
AAALLVGFILFLTRSRGRAASAGQEPLHNEEL 
AGAGRVAQPGPLEPEEPRAGGRPRRRRDLGS 
RLQAQRRAQRVAWAEADENEEEAVILAQEE 
EGVEKPAETHLSGKJGAKKLRKLEEKQARKA 
QREAEEAEREERKRLESQREAEWKKEEERLR 
LEEEQKEEEERKAREEQAQREHEEYLKLKEA 
FVVEEEGVGBTMTEEQSQSFLTEFINYIKQSK. 
VVLLEDLASQVGLRTQDTINRIQDLLAEGTIT 
GVIDDRGKFIYITPEELAAVANFIRQRGRVSIA 
ELAQASNSLIAWGRESPAQAPA 


983 


2333 


A 


8320 


244 


1420 


RRRWRARGGLVPTLAWAEATGAYVPGRDKP 
DLPTWKRNFRSALNRKEGLRLAEDRSKDPHD 
PHKIYEFVNSGVGDFSQPDTSPDTNGGGSTSD 
TQEDILDELLGNMVLAPLPDPGPPSLAVAPEP 
CPQPLRSPSLDNPTPFPNLGPSENPLKRIXVPG 
EEWEFEVTAFYRGRQVFQQTISCPEGLRLVGS

RHVLSCLGGGLALWRA(^WL^(^GHCH 

TYWAVSEELLPNSOHGPDGEVPKDKEGOVF 

DLGPIWGSLGPPDLITFTEGSGRSPRYALWFC 

VGESWPQDQPWTKRLVMVKWPTCLRALVE 

MARVGGASSLENTVDLfflSNSHPLSLTSDQY 

KAYLQDLVEGMDFQGPGES 


984 


2334 


A 


8321 


1 


1243 


ANMAPVEHWADAGAFLRHAALQDIGKNIY 

TIREVVTEIRDKATRRRLAVLPYELRFKEPLPE 

YVRLVTEFSKKTGDYPSLSATDIQVLALTYQL 

EAEFVGVSHLKQEPQKVKVSSSIQHPETPUnS 

GFHLPYKPKPPQETEKGHSACEPENLEFSSFM 

FWRNPLPNIDHELQELLIDRGEDVPSEEEEEEE 

NGFEDRKDDSDDDGGG^TTPSNIKQIQQELE 

QCDVPEDVRVGCLTTDFAMQNVLLQMGLHV 

LAVNGMLIREARSYILRCHGCFKTTSDMSRV 

FCSHCGNKUKKVSVTVSDDGTLHMHFSRNP 

KVLNPRGLRYSLPTPKGGKYAINPHLTEDQRF 

PQLRLSQKARQJCTNVFAPDYIAGVSPFVENDI 

SSRSATLQVRDSTLGAGRRRLNPNASRKKFV 

KKR 


985 


2335 


A 


8322 


352 


529 


RRNNIRQFIMKVCISGQARWLTPWPVLWET 
EAGRSLELKSLRPAWATWGNPISTKINK 


986 


2336 


A 


8325 


89 


1172 


KMWTD[ADTTLDESIYSNYYLY^IPCTCTKE 

GIKAFGELFLPPLYSLVFVFGLLGNSVWLVL 

FKYKiaRSMTDVYLLNLAISDLLFVFSLPFWG 

YYAADQWVFGLGLCKM1SWMYLVGFYSGIF 

FVMLMSEDRYLAIVHAVFSLRARTLTYGVITS 

LATWSVAVFASLPGFLFSTCYTERNHTYCKT 

KYSLNSTTWKVLSSLEIN1LGLVIPLGIMLFCY 

SMiniTLQHCKr^KKNKAVKMIFAVVVLFLG 

FWTPYNiyLFLETLVELE^QDCTFERYLDYA 

IQATETLAFVHCCLNPIIYFKLGEKFRKYILQL 

FKTCRGLFVLCQYCGLLQIYSADTPSSSYTQS 

TMDHDLHDAL 


987 


2337 


A 


8326 


3 


470 


SLSAMRFLAATFLLLALSTAAQAEPVQFKDC 
GSVDGVIKEVNVSPCPTQPCQLSKGQSYSVN 
VTFTSNIQSKSSKAVVHGILMGVPVPFPIPEPD 
GCKSGINCPIQKDKTYSYLNKXPVKSEYPStK 
LWEWQLQDDKNQSLFCWEDPVQIVSHL 


988 


2338 


A 


8335 


1205 


323 


VIKMALAARLLPQFLHSRSLPCGAVRLRTPA 

VAEVRLPSATLCYFCRCRLGLGAALFPRSAR 

ALAASALPAQGSRWPVLSSPGLPAAFASFPAC 

PQRSYSTEEKPQQHQKTKMIVLGFSNPINWV 

RTRKAFUWAYFDKEFSITEFSEGAKQAFAH 

VSKLLSQCKFDLLEELVAKEVLHALKEKVTS 

LPDNHKNAI^UNIDErvrrsrrGDISIYYDEKG 

RKFVNILMCFWYLTSANIPSETLRGASVFQVK 

LGNQNVETKQLLSASYEFQREFTQGVKPDWT 

IARIEHSKLLE . 


989 


2339 


A 


8349 


67 


185 


MSGFIHQLUQNLFCVYHTRLKTSQGLCLLSL 
KSLHPMS


990


2340 


A 


8361 


210 


1115 


ASPFLRPQGHDSGEREPFSQTPGLMQPFSIPVQ 

1TLQGSRRRQGRTAFPASGKKRETDYSDGDPL 

DVHKRLPSSTGEDRAVMLGFAMMGt"SVLMF 

FLLGTTILKPFMLSIQREESTCTAIHTDIMDDVV 

LDCAFTCGVHCHGQGKYPCLQVFVNLSHPG 

QKALLHYNEEAVQINPKCFYTPKCHQDRNDL 

LNSALDEKEFFDHKNGTPFSCFYSPASQSEDVI

LIKXYDQMAIFHCLFWPSLTLLGQALIVGMV 
RLTQHLSLLCEKYSTWRDEVGOKVPY1EQH 
QFKLCIMRRSKGRAEKS 


991 


2341 


A 


8369 


9 


921 


SSVVEFSALSVSMACLSPSQLQKFQQDGFLVL

EGFLSAEECVAMQQRIGEIVAEMDVPLHCRT

EFSTQEEEQLRAQGSTDYFLSSGDK1RFFFEK 

GVFDEKGNFLVPPEKSINKIGHALHAHDPVFK 

SITHSFKVQTLARSLGLQMPWVQSMYIFKQP 

HFGGEVSPHQDASFLYTEPLGRVLGVWIAVE 

DATLENGCLWFIPGSHTSGVSRRMVRAPVGS 

APGTSFLGSEPARDNSLFVPTPVQRGALVLIH 

GEVVHKSKQNLSDRSRQAYTFHLMEASGTT 

WSPENWLQPTAELPFPQLYT


992 


2342 


A 


8370 


906 


4 


MALS^CSRYYPREQGSAVPhiSfPEWELNV 

GGQVYFTRHSTLISIPHSLLWKMFSPKRDTAN 

DLAKDSKGRFFIDRDGFLFRYILDYLRDRQYV 

LPDHFPEKGRLKREAEYFQLPDLVKLLTPDEI 

KQSPDEFCHSDFEDASQGSDTRJCPPSSLLPAD 

RKWGF1TVGYRGSCTLGREGQADAKFRRVPR 

ILVCGRISLAKEVFGETLNESRDPDRAPERYTS 

RFYLKFKHLMGAPASNFILGFWGLGQNQDK 

HPVNIYLQQRSVIRPDLTSKKAGDLKGKGDA 

QEVSRRRRWLGDPEHL 


993 


2343 


A 


8379 


1 


2794 


MRMQRHICNDTMDFGDSGKRIGGGVLCLLHQ 

SNTSF1KLNNNGFEDIV1VIDPSVPEDEKI1EQIE 

DMVTTASTYLFEATEKJOTFKNVSILIPEMWK 

ENPQYKRPKHENHKHADVIVAPPTLPGRDEP 

YTKQFTECGEKGEYIHFTPDLLLGKKQNEYG 

PPGKLFVHEWAHLRWGVFDEYNEDQPFYRA 

KSKKIEATRCSAGISGRNRVYKCQGGSCLSRA 

CRIDSTTKLYGKDCQFFPDKVQTEKASIMFM 

QSIDSVVEFCNEKTHNQEAPSLQNIKCNFRST 

WEVISNSEDFKNTIPMVTPPPPPYFS1XKIRQRI 

VCLVLDKSGSMGGKDRLNRMNQAAKHFLLQ 

WENGSWVGMVHFDSTA1TVNKLIQIKSSDER 

NTLMAGLPTYPLGGTSICSGIKYAFQVIGELH 

SQLDGSEVULLTDGEDNTASSCIDEVKQSGAI 

VHFIALGRAADEAVIEMSKrrGGSHFYVSDEA 

QNNGLIDAFGALTSGNTDLSQKSLQLESKGLT 

LNSNAWMNDTVIIDSTVGKDTFFLTTWNSLPP 

SISLWDPSGTIMENFTVDATSKMAYLSIPGTA 

KVGTWAYNLQAKANPETLTITVTSRAANSSV 

PPITVNAKMNKDVNSFPSPMI\nfAEID5GYW 

VLGANVTAF1ESQNGHTEVLELLDNGAGADS 

FKNDGVYSRYFTAYTENGRYSLKVRAHGGA 

NTARLKLRPPLNRAAYIPGWWNGEIEANPP 

RPEIDEDTQTTLEDFSRTASGGAFWSQVPSL 

PLPDQYPPSQITDLDATVHEDKIILTWTAPGD 

NFDVGKVQRYIIRISASILDLRDSFDDALQVN 

TTDLSPKEANSKESFAFKPENISEENATHIFIAI 

KSIDKSNLTSKVSNIAQVTLFIPQANPDDIDPT 

PTPTPTPTPDKSHNSGVN1STLVLSVIGSVVIV 

NFELSTTI 


994 


2344 


A 


8385 


231 


644 


INSSPRTGRDHQHLNLHl'ERlDSRSQRA v Lkip 
RQNPGIFYWIFLPSRSHSASHGSRQRQVSCQG 
TQDEILKMRNTFAELKNSLEALSSRMDQAEE 
RIGTQAGVQWRDHGSLQPQPPEFKQCFHLSL 
PSSWDYRACLS 


995 


2345 


A 


8390 


194 


3421 


AWRKSSWPPRtitRRGEKSDQDKSGQKNKR

NNMGNGCSQKLATANLLRFLLLVLIPCICALV 

LLLEILLSYVGTLQKVYFKSNGSEPLVTDGEI 

QGSDV1LTNTIYNQSTVVSTAHPDQHVPAWT 

TDASLPGODQSHRNTSACMNITHSQCQMLPYH 

ATLTPLLSVVRNMEMEKFLKFFTYLHRLSCY 

QHIMLFGCTLAFPECIIDGDDSHGLLPCRSFCE 

AAKEGCESVLGMVNYSWPDFLRCSQFRNQT 

ESSNVSRICFSPQQENGKQLLCGRGENFLCAS 

GICIPGKLQCNGYNDCDDWSDEAHCNCSENL 

FHCHTGKCLNYSLVCDGYDDCGDLSDEQNC 

DCNPTTEHRCGDGRCIAMBWVCDGDHDCVD 

KSDEVNCSCHSQGLVECRNGQCIPSTFQCDG 

DEDCKDGSDEENCSVIQTSCQEGDQRCLYNP 

CLDSCGGSSLCDFNNSLNNCSQCEPITLELCM 

NLPYNSTSYPNYFGHRTQKEASISWESSLFPA 

LVQTNCYKYLMFFSCTILVPKCDVNTGEH1PP 

CRALCEHSKERCES VLGJ VGLQ WPEDTDCSQ 

FPEENSDNQTCLMPDEYVEECSPSHFKCRSGQ 

CVLASRRCDGQADCDDDSDEENCGCKERDL 

WECPSNKQCLKHTVICDGFPDCPDYMDEKN 

CSFCQDDELECANHACVSRDLWCDGEADCS 

DSSDEWDCVTT STNVNSSSFT MVHTl A ATFT-TT4 

VCADGWQE1LSQLACKQMGLGEPSVTKUQE 

QEKEPRWLTLHSNWESLNGTTLHELLVNGQS 

CESRSKISIXCTKQDCGRRPAARMNKRILGGR 

TSRPGRWPWQCSLQSEP3GHICGCVLIAKKW 

VLTVAHCFEGRENAAVWKVVLGINNLDHPS 

VFMQTRFVKTIILHPRYSRAVVDYDISIVELSE 

DISETGYVRPVCLPNPEQWLEPDTYCYITGW 

GHMGNKMPFKLQEGEVRIISLEHCQSYFDMK 

TITTRMICAGYESGTVDSCMGDSGGPLVCEK 

PGGRWTLFGLTSWGSVCFSKVLGPGVYSNVS 

YFVEWDKRQIYIQTFLLN 


996 


2346 


A 


8392 


199 


3085 


KVILSSEMSKTNKSKSGSRSSRSRSASRSRSRS

FSKSRSRSRSLSRSRKRRLSSRSRSRSYSPAHN 

KERNHPRVYQNRDFRGHNRGYRRPYYFRGR 

NRGFYPWGQYNRGGYGNYRSNWQNYRQAY 

SPRRGRSRSRSPKRRSPSPRSRSHSRNSDKSSS 

DRSRRSSSSRSSSNHSRVESSKRKSAKEKKSSS 

KDSRPSQAAGDNQGDEVKEQTFSGGTSQDTK 

ASESSKPWPDATYGTGSASRASAVSELSPRER 

SPALKSPLQSVWRRRSPRPSPVPKPSPPLSST 

SQMGSTLPSGAGYQSGTHQGQFDIIGSGSLSP 

SKKSPVGKSPPSTGSTYGSSQKEESAASGGAA 

YTKRYLEEQKTENGKDKEQKQTNTDKEKJKE 

KGSFSDTGLGDGKMKSDSFAPKTDSEKPFRG 

SQSPKRYKLRDDFEKKMADFHKEEMDDQDK 

DKAKGRKESEFDDEPKFMSKVIGANKNQEEE 

KSGKWEGLVYAPPGKEKQRICTEELEEESFPE 

RSKKEDRGKRSEGGHRGFVPEKNFRVTAYK 

AVQEKSSSPPPRKTSESRDKLGAKGDFPTGKS 

SFSnilEAQVNVRMDSFDEDLARPSGLLAQER 

KLCRDLVHSNKKEQEFRSIFQHIQSAQSQRSP 

SELFAQHIVTIVHHVKEHHFGSSGMTLHERFT 

KYLKRGTEQEAAKNKKSPEIHRRIDISPSTFRK 

HGLAHDEMKSPREPGYKAEGKYKDDPVDLR 

U5IERRKJCHKERDLKRGKSRESVDSRDSSHSR 

ERSAEKTEKTHKGSKKQKKHRRARDRSRSSS 

SSSQSSHSYKAEEYTEETEEREESTTGFDKSRL

GTKDFVGPSERGGGRARGTFQFRARGRGWG 
RGNYSGNNNNNSNNDFQKRNREEEWDPEYT 
PKSKKYYLHDDREGEGSDKWVSRGRGRGAF 
PRGRGRFMFRKSSTSPKWAHDKFSGEEGEIE 
DDESGTENREEKDNIQPTTE 


997 


2347 


A 


8398 


202 


552 


CPALGGRQDLQGTRLLWAHDSGVGGQKAKS 
KQENLESLEATGREEEGGQGPPVTTKGVLLA 
LLMAGLALQPGTALLCYSCKAQVSNEDCLQ 
VENCTQLGEQCWTARIREWGDDSRQA 


998 


2348 


A 


8400 


697 


301 


NPPSACTPGSCDSCSGRGRDLAFDSVWSTNN 
MSDPRRPNKVLRYKPPPSECNPALDDPTPDY 
MNLLGMIF SMCGLMLKLK W C A WVA V YCSF1 
SFANSRSSEDTKQMMSSFMLSISAWMSYLQ 
NPQPMTPPW 




999


2349


A 


8401 


93 


1126


ASASHITSGHLRCFPGSEGVGTMARCFSLVLL

LTSIWTTRLLVQGSLRAEELSIQVSCRIMGITL 

VSKKANQQLNFTEAKEACRLLGLSLAGKDQ 

VK1ALKASFETCSYGWVGDGFVVISRISPNPK 

CGKNGVGVLIWKVPVSRQFAAYCYNSSDTW 

TNSCIPEIITTKDPIFNTQTATQTTEFIVSDSTYS 

VASPYSTIPAPTTTPPAPASTSIPRRKKLICVTE 

VFMETSTMSTETEPFVENKAAFKNEAAGFGG

VPTALLVLALLFFGAAAGLGFCYVKRYVKAF 

PFmKNQQKEMIETKVVKEEKANDSNPNEES 

KKTDKNPEESKSPSKTTMRCLEAEV 


1000 


2350 


A 


8406 


2 


777 


KERCQFVVKPMLSTVGSFLQDLQNEDKGIKT 
AAIFTAIXjNMISASTLMDILLMNDFKLVINKI 
AYDVQCPKREKPSNEKTAEMEHMKSUTHRL 
FTILHLEESQKKREHHLLEKIDHLKEQLQPLE 
QVKAGIEAHSEAKTSGLLWAGLALLSIQGGA 
LAWLTWWVYSWDIMEPVTYFITFANSMVFF 
AYFIVTRQDYTYSAVKSRQFLQFFHKKSKQQ 
HFDVQQYNKLKEDLAKAKESLKQARHSLCL 
QMQVEELNEKN 


1001 


2351 


A 


8410 


1400 


264 



VGFWERPLRSSRWFRRSLRRWEMLARAARG 

TGALLLRGSLLASGRAPRRASSGLPRNTVVLF 

VPQQEAWVVERMGRFHRILEPGLNILIPVLDR 

IRYVQSLKEIVINVPEQSAVTLDNVTLQIDGV 

LYLRIMDPYKASYGVEDPEYAVTQLAQTTM 

RSELGKLSLDKVFRERESLNASIVDAINQAAD 

CWGIRCLRYEIKDIHVPPRVKESMQMQVEAE 

RRKRATVLESEGTRESA1NVAEGKKQAQ1LAS 

EAEKAEQ1NQAAGEASAVLAKAKAKAEAIRI 

LAAALTQHNGDAAASLTVAEQYVSAFSKJLA 

KDSN1ULPSNPGDVTSMVAQAMGVYGALT 

KAPVPGTPDSLSSGSSRDVQGTDASLDEELDR 

VKMS 


1002 


2352 


A 


8421 


134 


941 


NRENIXESRMMDPCSVGVQLRTTNECHKTY 

YTRHTGFKTLQELSSNDMLLLQLRTGMTLSG

NNTICFHHVKYIDRFEDLQKSCCDPFNIHKKL 

AKKNLHVIDLDDATFLSAKFGRQLVPGWKLC 

PKCTQUNGSVDVDTEDRQKRKPESDGRTAK 

ALRSLQFTNPGRQTEFAPETGKREKRRLTKN 

ATAGSDRQVIPAKSKVYDSQGLLIFSGMDLC 

DCLDEDCLGCTYACPACX3STKCGAECRCDRK 

WLYEQIEIEGGEIIHNKHAG 


1003 


2353 


A 


8427 


3 


1416 


TEWGLSGSCPGCSPLEPGSRGRGAAAWRILR

CRRLPEPSPFLTQPNLAQSQPPAPVPVTDPSVT 

MHPAVFLSLPDLRCSLLLLVTWVFTPVTTEIT 



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SLDTENnDEn^NABVALYNF YAD WC? D< ? A V"' 

LHPIFEEASDVIKEEFI^NQVVFARTOCDQH 

SDIAQRyiUSKTVFILKIJFKNGMMMKREY^^ 

RSVKALADYIRQQKSDPIQEIRDLAEITTLDRS 

KRNIIGYFEQKI)SD>rniVFERVANIIJHDDDC^ 

LSAFGDVSKPERYSGDNIIYKPPGHSAPDMVY 

LGAMTNFDVTYNWIQDKCVPLVREITFENGE 

ELTEEGLPFULFIIMKEDTESLEIFQNEVARQL 

ISEKGTINFLHADCDKFRHPLLHIQKTPADCP 

VIA1DSFRHMYVFGDFKDVLIPGKLKQFVFDL 

HSGKLHREFHHGPDPTOTAPGEQAQDVASSP 

PESSFQKLAPSEYRYTLLRDRDEL 


1004 


2354


A 


8432 


910 


387 


GLSRKLRAGFLPGFCRVSPCGSWVVETLVKM 

ACAAARSPADQDRHCIYPAYLNNKKTIAEGR 

RIPISKAVENPTATEIQDVCSAVGLNVFLEKN 

KMYSREWNRDVQYRGRVRVQLKQEDGSLC 

LVQFPSRKSVMLYAAEM1PKLKTRTQKTGGA 

DQSLQQGEGSKKGKGKKKK 


1005 


2355 


A 


8453 


90 


530 


QSHETKMQSGTHWRVLGLCLLSVGVWGQD 

GNEEMGGrTQTPYKVSISGTTVILTCPQYPGSE 

1LWQHNDKNIGGDEDDKNIGSDEDHLSLKEF 

SELEQSGYYVCYPRGSKPEDANFYLYLRARG 

NPGLQNRYHRLFREDHSKGHSQ 


1006 


2356 


A 


8458 


3 


307 


AVQRIRHEMNIFRLTGDLSHLAAIVILLLKIW 
KTRSCAGISGKSQLLFALVFTTRYLDLFTSFIS 
LYNTSMKVWYAIHRNVFI-ILQCTGLWTLNLC 
QLCIFN 


1007 


2357 


A 


8459 


43 


553 


GAGAGGDWAAMDKLKKVLSGQDTBDRSGL . 

SEWEASSLSWSTRIKGFIACFAIGILCSLLGT 

VLLWVPRKGLHLFAVFYTFGNIASIGSTEFLM 

GPVKQLKRMFEPTRLtATIMVLLCFALTLCSA 

FWWHNKGLALIFCILQSLALTWYSLSFIPFAR 

DAVKKCFAVCLA 


1008 


2358 


A 


8462 


487 


150 


AQDIRSVHSLGQKSTJ^VKHFRTLSHtkGLPDP 
PPHWPPQERSPPSHPCMPSHRPQIPQLSNSGPS 
DPRWGCVGPSMPTSTCLPGAVEASTTKASLP 
KCPVDSSLPTPEACFL 


1009 


2359 


A 


8465 


134 


954 


ETRVKTSLELLRTQLEPTGTVGNTIMTSQPVP 
NETIIVLPSNVINFSQAEKPEPTNQGQDSLKKH 
LHAEIKVIGTIQILCGMMVLSLGIILASASFSPN 
FTQVTSTLLNSAYPFIGPFFFIISGSLSIATEKRL 
TKLLVHSSLVGSILSALSALVGFIILSVKQATL 
NPASLQCELDKNNIPTRSYVSYFYHDSLYTTD 
CYTAKASLAGTLSLMUCTLLEFCLAVL1AVL 
RWKQAYSDFPGSVLFLPHSYIGNSGMSSKMT 
HDCGYEELLTS 


1010 


2360 


A 


8468 


2 


473 


KYRYRRPYPVMRKICQVGPAGLAFILNISPVA 
HRVALCHLAGCQEQAAWYHTLQILFFLVSAY 
FFSCPVPEKYFPGSCDIVGHGHQlFHAFLSICT 
LSQLEA1LLDYQGRQEIFLQRHGPLSVHMACL 
SFFFLAACSMTAALLRHKVKARLTKKDS


1011 


2361 


A 


8478 


5 


409 


TELSQLEKAHPPADMGRRKSKRKPPPKKKMT 
GTLETQFTCPFCNHEKSCDVKMDRARNTGVI 
SCTVCLEEFQTPITCILGNLGFFQRVGRGLESG 
PCSSGPLCALVQGQSRPEEQVPPSDFCGVRRC 
RAGFQCQ 


1012


2362 


A 


8481 


2810 


1652 


RTSTQKWQSVENDSQEHLERFYCNPENDRM 
RMKYGGQEFWADLNAMNVYETTEFDQLRR 
LSTPPSSNVNSIYHTVWKFFCRDHFGWREYPE 





SVIRLIEEANSRGLKEVRFMMWNNHYILHNS 

FFRREIKRRPLFRSCFILLPYLQTLGGVPTQAP 

PPLEATSSSQnCPDGVTSANr^ETWVYMHP 

SQDFIQVPVSAEDKSYRHYNLFHKTVPEFKYR 

ILQILRVQNQFLWEKYKRKKEYMNRKMFGR 

DRI1NERHLFHGTSQDVVDGICKHNFDPRVCG 

KHATMFGQGSYFAKXASYSHNFSKKSSKGV 

HFMFLAKVLTGRYTMGSHGMRRPPPVNPGS 

VTSDLYDSCVDNFFEPQIFVTFNDDQSYPYFVI 

QYEEVSNTVSI 


1013 


2363 


A 


8488 


2 


517 


1ENCRTRLRQAWHEVCGNKMAAPIPQGFSCL 
SRi^GWWFRQPVLVTQSAAIWVRTKJKJlFIP 
P1YQPKFKTEKEFMQHARKAGLV1PPEKSDRS 
IHLACTAGIFDAYVPPEGDARISSLSKEGUER 
TERMKKTMASQVSIRRJKDYDANFKKDFPE 
KAKDIFIEGSPLY 


1014 


2364 


A 


8501 


363 


17 


mTGYVYICIIYAQLMYTYYIRTAYVYICILY 
AQLMYTYVLYTHSLCIHMYSIRTAYVY1CUY 
AQIMYTYVFYTHRLCIHMYSIRTDYVYICILY 
AQLMYTYVFYTHSYMSDE 


1015 


2365 


A 


8504 


3 


2190 


NSSEHFSQAPQRLSFYSWYGSARLFRFRVPPD 

AVLLRWLLQVSRESGAACTDAEITVHFRSGA 

PPVINPLGTSFPDDTAVQPSFQVGVPLSTIPRS 

NAS VN VSHPAPGDWF VAAHLPPSSQK1EJLKG 

LAPTCAYVFQPELLVTRVVEISIMEPDVPLPQ 

TLLSHPSYLKVFVPDYTRELLLELRDCVSNGS 

LGCPVRLTVGPVTLPSNFQKVLTCTGAPWPC 

RLLLPSPPWDRWLQVTAESLVGPLGTVAFSA 

VAALTACRPRSVTIQPLLQSSQNQSFNASSGL 

LSPSPDHQDLGRSGRVDRSPFCLTNYPVTRED 

MDWSVHFQPLDRVSVRVCSDTPSVMRLRL 

NTGMDSGGSLTISLRANKTEMRNETVWACV 

NAASPFLGFNTSLNCTTAFFQGYPLSLSAWSR 

RANLIIPYPETDNWYLSLQLMCPENAEDCEQ 

AVYHVETTLYLVPCLNDCGPYGQCLLLRRHS 

YLYASCSGKAGWRGWSCTDNSTAQTVAQQR 

AATLLLTLSNLMFLAPIAVSVRRKFLVEASVY 

AYTMFFSTFYHACDQPGEAVLCILSYDTLQY 

CDFLGSGAMWVmCMARLKTVLKYVLFLL 

GTLVIAMSLQLDRRGMWNMLGPCLFAFVTM 

ASMWAYRCGHRRQCYPTSWQRWAFYLLPG 

VSMASVGlAlYTSMMTSD>TYYYTHSrWHILL 

AGSAALLLPPPDQPAEPWACSQKFPCHYQIC

KNDREELYAVT 


1016 


2366 


A 


8511 


1 


453 


K.WYPSOPVRIPGRFYYKLPAGHRRCRMAPAK 

KGGEKKKGRSAINEVVTREYTINIHKRIHGVG 

FKKRAPRALKEIRKFAMKEMGTPDVRIDTRL 

NKAVWAKGIRNVPYRIRVRLSRKRNEDEDSP 

NKLYTLVTYVPVTTFKNLQTVNVDEN 


1017 


2367 


A 


8513 


54 


1196 


LERTPASADMAWTKYQLFLAGLMLVTGSINT 

LSAKWADNFMAEGCGGSKEHSFQHPFLQAV 

GMFLGEFSCLAAFYLLRCRAAGQSDSSVDPQ 

QPFNPLLFLPPALCDMTGTSLMYVALNMTSA 

SSFQMLRGAVIIFTGLFSVAFLGRRLVLSQWL 

GILATIAGLVWGLADLLSKIIDSQI-IKLSEV1T 

GDLLnMAQIlVAIQMVLEEKFVYKHNVHPLR 

AVGTEGLFGFVn.SLLLWMYYIPAGSFSGWP 

RGTLEDALDAFCQVGQQPLIAVALLGNISSIA

FF^AGISVTKELSATTRMVLDSLRTVVIWAL 
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StALGV^AMAl^lLGFUL^ 

PIJLGRLSRORPLAEESEQEIU.LGGTRTPINDA 

S 


1018 


2368 


A 


8518 


324 


694 


SPFWTEKRRMEKPLFPLVPLHWFGFGYTALV 

VSGGIVGYVKTGSVPSLAAGLLFGSLAGLGA 

YQLYQDPRNVWGFLAATSVTFVGVMGMRS 

VYYGmffVGLIAGASU-MAAiCVGVRMLM 

TSD 


1019 


2369 


A 


8526 


2 


1787 


VSAAAVNMEPPDAPAQARGAPRLLLLAVLL 

AAHPDAQAEVRLSVPPLVEVMRGKSVILDCT 

PTGTHDHYMLEWFLTDRSGARPRLASAEMQ 

GSELQVTMHDTRGRSPPYQLDSQGRLVLAEA 

QVGDERDYVCVVRAGAAGTAEAAARLNVF 

AKPEATEVSPNKGTLSVMEDSAQEIATSNSRN 

GNPAPKITWYRNGQRLEVPVEMNPEGYMTS 

RTVREASGLLSLTSTLYLRLRKDDRDASFHC 

AAHYSLPEGRHGRLDSPTFHLTLHYPTEHVQ 

FWVGSPSTPAGWVREGD'I'VQLLCRGIXjSPSP 

EYTLFRLQDEQEEVLNVNLEGNLTLEGVTRG 

QSGTYGCRVEDYDAADDVQLSKTLELRVAY 

LDPLELSEGKVLSLPLNSRAWNCSVHGLPTP 

ALRWTKDSTPLGDGPMLSLSS1TFDSNGTYVC 

EASLPTVPVLSRTQNFTLLVQGSPELKTAEIEP 

KADGSWREGDEVTLICSARGHPDPKLSWSQL 

GGSPAEPEPGRQGWVSSSLTLKVTSALSRDGI 

SCEASNPHGNKRHVFHFGTVSPQTSQAGVAV 

MAVAVSVGLLLLVVAVFYCVRRKGGPCCRQ 

RREKGAP 


1020 


2370 


A 


8530 


2 


1200 


PRVRLLRPSRSRSCRGLLSTRAPGPSPFRSLHS 

SPLLPHAMKSPFYRCQNTTSVEKGNSAVMGG 

VLFSTGLLGNLLALGLLARSGLGWCSRRPLR 

PLPSVFYMLVCGLTVTDLLGKCLLSPWLAA 

YAQNRSLRVLAPALDNSLCQAFAFFMSFFGL 

SSTLQLLANLALECWLSLGHPFFYRRHITLRLG 

ALVAPVVSAFSLAFCALPFMGFGKFVQYCPG 

TWCFIQMVHEEGSLSVLGYSVLYSSLMALLV 

LATVLCNLGAMRNLYAMHRRLQRHPRSCTR 

DGAEPRADGRBASPQPLEELDHLLLLALMTV 

LFTMCSLPVIYRAYYGAFKDVKEKNRTSEEA 

EDLRALRFLSVISIVDPWIFirFRSPVFRIFFHKI 

FIRPLRYRSRCSNSTNMESSL 


1021 


2371 


A 


8536 


1 


237 


RRGEIDMATEGDVELELETETSGPERPPBKPR 
KHDSGAADLERVTDYAEEKEIQSSNLETAMS 
VIGDRRSREQKAICQER , 


1022 


2372 


A 


8537 


94 


541 


RKERRRRRRRMEAVVFVFSLLDCCALIFLSV 
YFUTLSDLECDYINARSCCSKLNKWVIPELIG 
HT1VTVLLLMSLHWFIFLLNLPVATWNIYRYI 
MVPSGNMGVFDPTEIHNRGQLKSHMKEAMI 

TTt OFHT I iTFMVT VQMT1 AT TXTTV 


1023 


2373 


A 


8540 


26 


431 


RMMKCPQALLAIFWLLLSWVSSEDKWQSPL 
SLWHEGDTVTLNCSYEVTNFRSLLWYKQEK 
KAPTFLFMLTSSGIEKKSGRLSSILDKKELSSIL 
NITATQTGDSAIYLCAVEAQCSLVTCSLYSNS 
TAEALQL 


1024 


2374 


A 


8544 


1731 


743 


GVRLRYSPIAVVMVOEAGRDLRRRRAVAVT 

AEKMAVLAPLIALVYSVPRLSRWLAQPYYLL 

SALLSAAFLLVRKLPPLCHGLPTQREDGNPCD 

FDWREVEILMFLSAIVMMKNRRSITVEQHIGN 

IFMFSKVANTILFPRLDIRMGLLYraCfVFLM 
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TCKPPLYMGPEYIKYFNDKTIDEELERDKRVT 

WIVEFFANWSNDCQSFAPIYADLSLKYNCTG

LNFGKVDVGRYTDVSTRYKVSTSPLTKQLPT 

LILFQGGKEAMRRPQIDKKGRAVSWTFSEEN 

VIREFNLNELYQRAKKLSKAGDN1PEEQPVAS 

TPTTVSDGENKKDK 


1025 


2375 


A 


8546 


2194 


1707 


TVSFHKTMASLKCSTWCVICLEKPKYRCPA 
CRVPYCSWCFRKHKBQCNPETRPVEKKIRS 
ALPTKTVKPVENKDDDDSIADFLNSDEEEDR 
VSLQNLKNLGESATLRSLLLNPHLRQLMVNL 
DQGEDKAKLMRAYMQEPLFVEFADCCLGIV 
EPSQNEES 


1026


2376


A 


8547 


1078 


594 


V(SMfiL3PAVNLlK:VlLLGH\VLLTTWCCIVFSGS 

YAWANFTILALGVWAVAQRDSIDAISMFLGG 

LLATIFLDIVHISIFYPRVSLTDTGRFGVGMAIL 

SLLLKPLSCCFVYHMYRERGGELLVP1TGFLG 

SSQDRSAYQTIDSAEAPADPFAVPEGRSQDAR 

GY 


1027 


2377 


A 


8557 


1


340 


DFLGPASPQEEGGSESSTMTELETAMGMIIDV 
FSRYSGSEGSTQTLTKGELKVLMEKELPGFLQ 
SGKDKDAVDKLLKDLDANGDAQVDFSEFIVF 
VAATTSACHKYFEKAGLK 


1028 


2378 


A 


8569 


20 


963 


KMAATLGPLGSWQQWRRCLSARDGSRRLLL 

LLLLGSGQGPQQVGAGQTFEYLKREHSLSKP 

YQGEAPRPCFLRBWELQVHFKIHGQGKKNL 

HGDGLAIWYTKDRMQPGPVFGNMDKFVGLG 

VFVDTYPNEEKQQERVFPYISANWINGSLSY 

DHERDGRPTELGGCTAIVRNLHYDTFLVIRY 

VKRHLmiMDIIX5KHEWRDCIEVPGVRLPRG 

YYFGTSSITGDLSDNHDVISLKLFELTVERTPE 

EEKLHRDVFLPSVDNMKLPEMTAPLPPLSGL 

ALFLIVFFSLVFSVFAIVIGIILYNKWQEQSRK 

RFY 


1029 


2379 


A 


8572 




578 


AAAASHRSRARSRPRRVSSGPAPRRAQSSAG 
RVASGLDSAPLCTMARALCRLPRRGLWLLLA 
HHLFMTTACQEANY GALLRELCLTQFQ VDM 
EAVGETLWCDWGRTIRSYRELADCTWHMAE 
KLGCFWPNAE VDRFFL AVHGRYFRS CP1 SGR 
AVRDPPGSILYPFIWPITVTLLVTALVVWQS 
KRTEGIV i 


1030 


2380 


A 


8574 


1352 


372 


DSSTVKGGSESRHLCLIPDLKGKARTREASSG 

SRTCGRRTSLCTSAKSSWTYRSGRLSWQSIKG 

THLTITQALRQPLHRAPLLPGQLCWSPRPLEK 

NKAMGRPLLLPLLLLLQPPAFLQPGGSTGSGP 

SYLYGVTQPKHLSASMGGSVEIPFSFYYPWEL 

AIVFNVRISWRRGHFHGQSFYSTRPPSIHKDY 

VNRLFLNWTEGQESGFLRISNLRKEDQSVYF 

CRVELDTRRSGRQQLQSDCGTKLTITQAVTTT 

TTWRPSSTTTIAGLRVTESKGHSESWHLSLDT 

AIRVALAVAVLKTVDLGLLCLLLLWWRRRKG 

SRAPSSDF 


1031 


2381 


A 


8580 


905 


340 


RRTAGIYPCFPKPGRTRHALCSWLLLLTGQL 

AFDDFQESCAMMWQKYAGSRRSMPLGARIL 

FHGVFYAGGFAIVYYLIQKFHSRALYYKLAV 

EQLQSHPEAQEALGPPLNIHYLKL1DRENFVDI 

VDAKLKIPVSGSKSEGLLYVHSSRGGPFQRW 

HLDEVFLELKDGQQIPVFKLSGENGDEVKKE 


1032 


2382 


A 


8593 


2558 


961 


RRRPRLLPGAEPCEPRVGPRRADMGCSAKAR 
WAAGALGVAGLLCAVLGAVMIVMVPSLUCQ 






WO 01/57188 PCT/US01/03800 



SEQ ID NO: of nucleotide sequence


SEQ ID NO: of peptide sequence


Method


SEQ ID NO: in USSN 09/496914


Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence


Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence


Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop codon, /=possible nucleotide deletion, \=possible nucleotide insertion)
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n YLK2^/RnDPSSLS!^^E!?IPF^ 

VMNPSEJLKGEKPQVRERGPYVYREFRHKSNI 

TFNNNDTVSFLEYRTFQFQPSKSHOSESDYIV 

MPNILVLGMVMMENKPMTLKLIMTLAFTTL 

GERAFMNRTVGEIMWGYKDPLmiNKYFP 

GMFPFKDKFOLFAELNNSDSGLFTGFTGVQNI 

SRIHLVDKWNGLSKVDFWHSDQCNMINGTS 

GQMWFPFMTPESSLEFYSFKACKSMKJLMYKE 

SGVFEGIPrYRFVAPKTLFANGSIYPPNEGFCP 

CLESGIQNVSTCRFSAPLFLSHPHFLNADPVL 

AEAVTGLHFNQEAHSLFLDIHPVTGIPMNCSV 

KLQLSLYMKSVAGlGQTGfCIEPWLPLLWFA 

ESGAMEGETLHTFYTQLVLMPKVMHYAQYV 

LLALGCVLLLVPVICQIRSQEKCYLFWSSSKK 

GSKDKEAIQAYSESLMTSAPKGSVLQEAKL 


1033 


2383 


A 


8595 


595 


767 


AHLPDTLLLPPHSPTVPTPKSFQCSQKACFSRS 
FCLLLSLVSSSLVSLSLCPPLTQA 


1034 


2384 


A 


8597 


640 


164 


VTTSCDPFAFGLGVRASERLAEIDMPYLLKYQ 
PMMQTIGQKYCMDPAVIAGVLSRKSPGDKJL 
VNMGDRTSMVQDPGSQAPTSWISESQVFQTT 
EVLTTRITELQRRFPTWTPDQYLRGGLCAYSG 
GAGYVRSSQDLSGDFCNDVLARAKYLKRHG 
F 


1035 


2385 


A 


8603 


936 


204 


AMASTLEYSPSPLRRLVGPAAGFSRAARADL 
SWDPMAFFTGLWGPFTCVSRVLSHHCFSTTG 
SLSAIQKMTRVRWDNSALGNSPYHRAPRCI 
HVYKKNGVGKVGDQILLADCGQKKKALIVG 
HCMPGPRMTPRFDSNNVVLIEDNGNPVGTRI 
KTPIPTSLRKREGEYSKVLAIAQNFV 


1036 


2386 


A 


8606 




562 


PTRAHSFDLCCSPCRJRJRLLGREEAGEEPTSPV 

TQYLQPRSPEECKMFACAKLACTPSLIRAGSR 

VAYRPISASVLSRPEASRTGEGSTVFNGAQNG 

VSQLIQREFQTSAISRDIDTAAKFIGAOAATVG 

VAGSGAGIGTVFGSLIIGYAR>iPSLKQQLFSY 

AJDLGFALSEAMGLFCLMVAFLILFAM 


1037 


2387 


A 


8615 


2 


2364 


SPGPSLPESAESLDGSQEDKPRGSCAEPTFTDT 

GMVAHINNSRLKAKGVGQHDNAQNFGNQSF 

EELRAACLRKGELFEDPLFPAEPSSLGFKDLG 

PNSKNVQNISWQRPKDIINNPLFIMDGISPTDI 

CQGILGDCWLLAAIGSLTTCPKLLYRWPRG 

QSFKKNYAGIFHFQIWQFGQWVNVVVDDRL 

PTKNDKLVFVHSTERSEFWSALLEKAYAKLS 

GSYEALS GGSTMEGLEDFTGGVAQSFQLQRP 

PQNLLRLLRKAVERSSLMGCSIEVTSDSELES 

MTDKMLWGHAYSVTGLQDVHYRGKMETLI 

RVRNPWGRIEWNGAWSDSAREWEEVASDIQ 

MQLLHKTEDGEFWMSYQDFLNNFTLLEICNL 

TPDTLSGDYKSYWHTTFYEGSWRTGSSAGGC 

RNHPGTFWTNPQFKISLPEGDDPEDDAEGNV 

WCTCLVALMQKNWRHARQQGAQLQTIGFV 

LYAVPKEFQNIQDVHLKKEFFTKYQDHGFSEI 

FTNSREVSSQLRLPPGEYHIPSTFEPHRDADFL 

LRVFTEKHSBSWELDEVNYAEQLQEEKVSED 

DMDQDFLHLFKIVAGEGKEIGVYELQRLLNR 

MADCFKSFKTKGFGLDACRCMINLMDKDGSG 

KLGLl^FKILWKKLKKWMDIFRECDQDHSGT 

LNSYEMRLVIEKAGIKLNNKVMQVLVARYA 

DDDLIIDFDSFISCFLRLKTMFTFFLTMDPKNT 

GHICLSLEQVLGEGWEGICRIAPACPSTPPPPS 
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SDVPGPASCPRJLFPPWDLLPVSTVAADDHVGI 
EAL 


1038 


2388 


A 


8&1 


3 


1494 


RSRMARAPLGVLLLLGLLGRGVGKNEELRLV 

HHLFNNYDPGSRPVREPEIJIVnSLKVTLTO 

ISLNEKJEETLTTSVWIGIDWQDYKLNYSKDDF 

GGIETLRVPSELVWLPETVLENNIDGQFGVAY 

DANVLVYEGGSVTWLPPAIYRSVCAVEVTYF 

PFDWQNCSLIFRSQTYNAEEVEFTFAVDNDG 

KTINKIDIDTEAYTENGEWAIDFCPGVTRRHH 

GGATDGPGETDVIYSLIIRRKPLFYVINnVPCV 

LISGLVLLAYFLPAQAGGQKCTVSINVLLAQT 

VFLFLIAQiaPETSLSVPU-GRFLIFVMWATLI 

VMNCVIVLNVSQRTPTTHAMSPRLRHVLLEL 

LPRLLGSPPPPEAPRAASPPRRASSVGLLLRAE 

ELILKKPRS EL VFEG QRHRQGTWTAAFCQ SL 

GAAAPEVRCCVDAVNFVAESTRDQEATGEE 

VSDWVRMGNALDNICFWAALVLFSVGSSLIF 

LGAYFNRVPDLPYAPCIQP 


1039 


2389 


A 


8636 


1 


900 


PGRERPGGGGARRRPQHLPALLPSERPDCATL 

QAMENELPVPHTSSSACATSSTSGASSSSGCN 

NSSSGGSGRPTGPQISVYSGIPDRQTVQVIQQ 

ALHRQP STAAQ YLQQMYAAQQQHLMLQTA 

ALQQQHLSSAQLQSLAAVQQASLVSNRQGST 

SGSNVSAQAPAQSSSJNLAASPAAAQLLNRA 

QSVNS AAASGIAQQAVLLGNTS SPALTASQA 

QMYLRAQMLIFTPTATVATVQPELGTGSPAR 

PPTPAQVQNLTLRTQQTPAAAASGPTPTQPVL 

PSLALKPTPGGSQPLPTPA 


1040 


2390 


A 


8645 


98 


1388 


ASQLAFGGKLTSTPSRDFQGCGRGAVTCCSF 

HEHRHQSGRCLSTGMAPNLKGRPRKKKPCPQ 

RRDSFSGVKDSNNNSDGKAVAKVKCEARSA 

LTKPKNNHNCKKV SNEEKPKVAIGEECRADE 

QAFLVALYKYMKERKTPIERIPYLGFKQINLW 

TMFQAAQKLGGYETTTARRQWKHIYDELGG 

NPGSTSAATCTRRHYERLILPYERFIKGEEDKP 

LPPIKPRKQENSSQENENKTKVSGTKRHCHEIP 

KSKKEKEN APKPQD AAEVS SEQEKEQETLISQ 

KSIPEPLPAADMKKKIEGYQEFSAKPLASRVD 

PEKDNETDQGSNSEKVAEEAGEKGPTPPLPSA 

PLAPEKDSALVPGASKQPLTSPSALVDSKQES 

KLCCFTESPESEPQEASFPRLPHHTGHRWQTR 

MRRRMTNCPPWQITLPTAP 


1041 


2391 


A 


8646 


113 


1492 


LLQEMCTKTCPVLWGCFLLWNLYVSSSQTIYP 

GnCARITQRALDYGVQAGMKMIEQMLKEKK 

LPDLSGSESLEFLKVDYVNYNFSNIKISAFSFP 

NTSLAFVPGVGDCALTNHGTANISTDWGFESP 

LFVLYNSFAEPMEKPILKNLNEMLCPHASEVK 

ALNANLSTLEVLTKIDNYTLLDYSLISSPEITE 

NYLDLNLKGVFYPLENLTDPPFSPVPFVLPER 

SNSMLYIGIAEYFFKSASFAHFTAGVFNVTLS 

TEEISNHFVQNSQGLGNVLSRIAEIYILSQPFM 

VRJMATEPPIINLQPGNFTLDIPASIMMLTQPK 

NSTVETIVSMDFVASTSVGLVILGQRLVCSLS 

LNRFRLALPESNRSNIEVLRFENILSSILHFGVL 

PLANAKLQQGFPLPNPHICFLFVNSDIEVLEGF 

LLISTDLKYETSSKQQPSFHVWEGLNLISRQW 

RGKSAP 


1042 


2392 


A 


8672 


538 


170 


ARRIARTRESKAAVSQDNVPALQPGKJCKKLR 
1X5GKKKKFKFFRLPKEFKKQLMYSPSNFKKM 
TSLAGNTVQCLNKLKWIYSAQYPAYGNITT 
LDMiTSTDHVLEQDFWICFTFYSVKERQI 


1043 


2393 


A 


8688 


359 


17 


GLKTRAPAXPTTQREVLGPAKQDMQRRCPRI
GLMTSLLKPOCRRWRDYKRWKSGGFTGESC 
HHADTLGDRGGLQGDHSELLQWQKRILRTE 
GEPSPKYISKNIFPICSYITGFL 


1044 


2394 


A 


8718 


292 


1490 


GTVKTSVATPITAGHSCSSGOVLQVKSPATQS

GFKFTSKMEDFNMESDSF^FWKGEDLSNYS 

YSSTLPPFLLDAAPCEPESLEINKYFWIIYAL 

VFLLSLLGNSLVMLVHYSRVGRSVTDVYLL 

NLALADLLFALTLPIWAASKVNGWIFGTFLC 

KVVSLLKEVNFYSGILLLACISVDRYLAIVHA 

TRTL'IQKRYLVKFICLSIWGLSLLLALPVLLFR 

RTVYSSNVSPACYEDMGNNTANWRMLLRIL 

PQSFGFIVPLLIMLFCYGFTLRTLFKAHMGQK 

HRAMRVIFAWLIFLLCWLPYNLVLLADTLM 

RTQVIQETCERRNHIDRALDATE1LGILHSCLN 

PLIYAFIGQKFRHGLLKILAIHGLISKDSLPKDS 

RPSFVGSSSGHTSTTL 


1045 


2395 


A 


8724 


254 


3184 


FRANLAITVANRRGAQGGkMHTCCPPVTLEQ 

DLHRKMHSWMLQTLAFAVTSLVLSCAETIDY 

YGEICDNACPCEEKDGILTYSCENRGIISLSEIS 

PPRFPIYHLLLSGNLLNRLYPNEFVNYTGASIL 

HLGSNVIQDIETGAFHGLRGLRRLHLNNNKL 

ELLRDDTFLGLENLEYLQVDYNYISVTEPNAF 

GKLHLLQVLILNDNLLSSLPNNLFRFVPLTHL 

DLRGNRLKLLPYVGLLQHMDKWELQLEEN 

PWNCSCELISLKDWLDSISYSALVGDVVCETP 

FRLHGRDLDEVSKQELCPRRLISDYEMRPQTP 

LSTTGYLHTTPASVNS\ f ATSSSAVYKPPLKPP 

KGTRQPNKPRVRPTSRQPSKDLGYSNYGPSIA 

YQTKSPVPLECPTACSCNLQISDLGLNVNCQE 

RKIESIAELQPKPYNPKKMYLTENYIAVVRRT 

DLLEATGLDLLHLGNNRISM1QDRAFGDLTN 

LRRLYLNGNRIERLSPELFYGLQSLQYLFLQY 

NLIREIQSGTFDPVPNLQLLFLNNNLLQAMPS 

GVFSGLTLLRLNLRSNHFTSLPVSGVLDQLKS 

LIQIDLHDNPWDCTCDIVGMKLWVEQLKVG 

VLVDEVICKAPKKFAETDMRSIKSELLCPDYS 

DVVVSTPTPSSIQVPARTSAVTPAVRLKSTGA 

PASLGAGGGASSVPLSVLILSLLLVFIMSVFVA 

AGLFVLVMKRRKKNQSDHTSTNNSDVSSFN 

MQYSVYGGGGGTGGHPHAHVHHRGPALPK 

VKTPAGHVYEYIPHPLGHMCKNPIYRSREGN 

avbu iKULHfcUvv J YSSNHHLQQQQQPPPPP 

QQPQQQPPPQLQLQPGEEERRESHHLRSPAYS 

VSTDBPREDLLSPVQDADRFYRGILEPDKHCST 

i rAuXMdLrJa i rKrrLbrAAYTF SPNYDLRRPH 

QYLHPGAGDSRLREPVLYSPPSAVFVEPNRNE 

YLELKAKLNVEPDYLEVLEKQTTFSQF 


1046 


2396 


A 


8736 


28 


452 


SPSAAGGLAWVSLALGSGSRGRDHSGSGVGT

AMAGALVRKAADYVRSKDFRDYLMSTHFW 

GPVANWGLPIAAINDMKKSPEIISGRMTFALC 

CYSLTFMRFAYKVQPRNWLLFACHATNEVA 

QLIQGGRLIKHEMTKTASA 


1047 


2397 


A 


8741 


673 


924 


ALPGTPQQTVTLNTDGKVKSFTSPHSNPNLPP 
AKFFTSLQSLNWSSHLFPSPATESVGKRGNAK 
PPTTKLLHSSPLWNFFAQQL 


1048 


2398 


A


8747


3 


5054 


PEVTKPSLSQ^TAASPIGSSPSPPVNGGNNARR 

VAVPNGQPPSAARYMPREVPPRFRCQQDHK

VLLKRGQPPPPSCMLLGGGAGPPPCTAPGAN 

PNNAQVTGALLQSESGTAPDSTLGGAAASNY 

ANSTWGSGASSNNGTSPNPIHIWDKVIVDGS 

DMEEWPCIASKDTESSSENTTDNNSASNPGSE 

KSTLPGSTTSNKGKGSQCQSASSGNECNXGV 

WKSDPKAKSV QSSNSTTENNNGLGN WRNVS 

GQDRIGPGSGFSNFNPNSNPSAWPALVQEGTS 

RKGALETDNSNSSAQVSTVGQTSREQQSKME 

NAGVNFWSGREQAQIHNTDGPKNGNTNSL 

NLSSPNPMENKGMPFGMGLGNTSRSTDAPSQ 

STGDRKTGSVGSWGAARGPSGTDTVSGQSNS 

GNNGNNGKEREDSWKGASVQKSTGSKNDS 

WDNNNRSTGGSWNFGPQDSNDNKWGEGNK 

MTSGVSQGEWKQPTGSDELKJGEWSGPNQPN 

SSTGAWDNQKGHPIXENQGNAQAPCWGRSS 

SSTGSEVEGQSTGSNHKAGSSDSHNSGRRSY 

RPTHPDCQAVLQTLLSRTDLDPRVLSNTGWG 

QTQIKQDTVWDIEEVPRPEGKSDKGTEGWES 

AATQTKNSGGWGDAPSQSNQMKSGWGELS 

ASTEWKDPKNTGGWNDYKNNNSSNWGGGR 

PDEKTPSSWNENPSKDQGWGGGRQPNQGWS 

SGKNGWOEEVDQTKNSNWESSASKPVSGWG 

EGGQNEIGTWGNGGNASLASKGGWEDCKRS. 

PAWNETGRQPNSWNKQHQQQQPPQQPPPPQ 

PEASGSWGGPPPPPPGNVRPSNSSWSSGPQPA 

TPKDEEPSGWEEPSPQSISRKMDIDDGTSAWG 

DPNSYNYKNVNLWDKNSQGGPAPREPNLPTP 

MTSKSASDSKSMQDGWGESDGPVTGARHPS 

WEEEEDGGVWNTTGSQGSASSHNSASWGQG 

GKKQMKCSLKGGNNDSWMNPLAKQFSNMG 

LLSQTEDNPSSKMDLSVGSLSDKKFDVDKRA 

MNLGDFNDMRKDRSGFRPPNSKDMGTTDS 

GPYFEKGGSHGLFGNSTAQSRGLHTPVQPLN 

SSPSLRAQVPPQHSPQVSASMLKQFFNSGLSP 

GLFN VGPQLSPQ Q1AML SQLPQIPQFQLACQL 

LLQQQQQQQLLQNQRK1SQAVRQQQEQQLA 

RMVSALQQQQQQQQRQPGMKHSPSHPVGPK 

PHU)NMVPNALNVGLPDLQTKGPIPGYGSGF 

SSGGMDYGMVGGKEAGTESRFKQWTSMME 

GLPSVATQEANMHKNGAIVAPGKTRGGSPY 

NQFDIIPGDTLGGHTGPAGDSWLPAKSPPTNK 

IGSKSSNASWPPEFQPGVPWKGIQNIDPESDP 

YVTPGSVLGGTATSPIVDTDHQLLRDNTTGS 

NSSLNTSLPSPGAWPYSASDNSFTNVHSTSAK 

FPDYKSTWSPDPIGHNFnfl,SWKMWKNHISS 

RNTTPLPRPPPGLTNPKPSSPWSSTAPRSVRG 

WGTQDSRL AS AST WSD GGS VRPSYWL VLHN 

LTPQIDGSTLRTICMQHGPLLTFHLNLTQGTA 

LIRYSTKQEAAKAQTALHMCVLGNTTILAEF 

ATDDEVSRFLAQAQPPTPAATPSAPAAGWQS 

LETGQNQSDPVGPALNLFGGSTGLGQWSSSA 

GGSSGADLAGASLWGPPNYSSSLWGVPTVED 

PHRMGSPAPLLPGDLLGGGSDS1 


1049 


2399 


A 


8748 


200 


1387 


VPWKRQDEQLSLQVE1XYLDSPAV1HLLSPTF 
LPPSSLPPFLQIVDSSSSACTLDSFFPFLAPWDS 
PQDCGFKDHQPLTLQALTVELARWTLMLLLS 
TAMY GAHAPLLALCHVDGRVPFRPSSAVLLT 
ELTKLLLCAFSLLVGWQAWPQGPPPWRQAA 
PFALSALLYGANNNLVIYLQRYMDPSTYQVL 
SNLKIGSTAVLYCLCLRHRLSVRQGLALLLL 

MPLHITPLGLLULILYCUSGLSSVYTELLMKR 
QRLPLALQNLFLYTFGVLLNLGLHAGGGSGP 
GLLEGFSGWAALVVLSQALNGLLMSAVMKH 
GSSITRLFWSCSLWNAVLSAVLLRLQLTAA 
FFLATLUGLAMRLYYGSR 


1050 


2400 


A 


8758 


3 


1660 


W VS SMGFEELLEQVGGFGPFQLRNVALLALP 

RVLLPLHFLLPIFLAAVPAHRCALPGAPANFS 

HQDVWLEAHLPREPDGTLSSCLRFAYPQALP 

NTTLGEERQSRGELEDEPATVPCSQGWEYDH 

SEFSSTIATESQWDLVCEQKGLNRAASTFFFA 

GVLVGAVAFGYLSDRFGRRRLLLVAYVSTLV 

LGLASAASVSYVMFAITRTLTGSALAGFTUV 

MPT FT FWT TWFHTJTVAflVT QCTPU/TPPtAJi 

LALVGYL1RDWRWLLLAVTLPCAPG1LSLWW 

VPESARWLLTQGHYKEAHRYLLHCARLNGR 

PVCEDSFSQEAVSKVAAGERWRRPSYLDLF 

RTPRLRHISLCCVWWFG\ r NFSYYGLSLDVS 

GLGLNVYQTQLLFGAVELPSKLLVYLSVRYA 

GRRLTQAGTLLGTALAFGTRIXVSSDMKSWS 

TVLAVMGKAFSEAAFTTAYLFTSELYPTVLR 

QTGMGLTALVGRLGGSLAPLAALLDGVWLS 

LPKLTYGGIALLAAGTALLLPETRQAQLPETI 

QDVERKSAPTSLQEEEMPMKQVQN 


1051 


2401 


A 


8759 


515 


1625 


EIRTPVAVSSAPSGDSEGDEEETTQDEVSSKTS 

FFnrJflWTf VPTfFT FKTTFnDVn/^UX/vrniXDv/ 
C/CtiJVju y vjvvc,jsjijUJiiN IJi^rVUUiNiiV VJixlbV 

TGNLNSDPLLELCQCPLCQLDCGSREQLIAHV 

YQHTAAWSAKSYMCPVCGRALSSPGSLGR 

HLLIHSEDQRSNCAVCGARFTSHATFNSEKLP 

EVLNMESLPTVHNEGPSSAEGKDIAFSPPVYP 

AGiaV(^CAAYRKLLEAQTPSVRKWALRR 

QNEPLEVRLQRLERERTAKKSRRDNETPEERE 

VRRMRDREAKRLQRMQETDEQRARRLQRDR 

EAMRLKRANETPEKRQARLIREREAKRLKRR 

LEKMDMMLRAQFGQDPSAMAALAAEMNFF 

QLPVSGVELDSQIXGKMAFEEQNSSSLH 


1052 


2402 


A 


8763 


1106 


70 


RHGHGGRDRRGGGRVARPGGLGRYPGRGAA 
ART-VFVPTRHRRfrP^ATAWA AMI A VT-rcrwr 1 a 

mjLi t r v i j IVJVrvOvJl OVJ I /xO V t\J\WL/\ I itoVJ x v#A 

HGSKHRARAAPDPPPLFDDTSGGYSSQPGGY 

PATGADVAFSVNHLLGDPMANVAMAYGSSi 

A5HGKDMVHKELHRFVSVSJKLKYFFAVDTA 

YVAKKLGLLVFPYTHQNWEVQYSRDAPLPP 

RQDLNAPDLYIFTMAFITYVLLAGMALGIQK 

RFSPEVLGLCASTALVWWMEVLALLLGLYL 

ATVRSDLSTFHLLAYSGYKYVGMILSVLTGL 

LFGSDGYYVALAWTSSALMYFTVRST RTAAT 

jit if i kjorxjuivx Al ly ivOi^Xv l ^A/vL 

GPDSMGGPVPRQRLQLYLTLGAAAFQPLnY 
WLTFHLVR 


1053 


2403 


A 


8768 


2 


712 


RPPRVWYPELRELSAAAPRWSHRTAPGIMVF 

YHSSSVNSSAYT1YMGKDKYENEDLIKHGW 

PEDIWFHVDKLSSAHVYLRLHKGENIED1PKE 

VUVIDCAHLVKANSIQGCKM^VNVWTPW 

SNLKKTADMDVGQIGFmQKDVKIVTVEKK 

VNEILNRLEKTKVERFPDLVAEKECRDREER 

NEKKAQIQEMKKREKEEMKKKREMDELRSY 

SSLMKVENMSSNQDGNDSDEFM 


1054 


2404 


A


8769 


344 


527 


imriXACRNSCWVFSRCSLGACKPTVCSMP 
SLSRQGSQTLCLRLAEYCMESVDSQRLLLS 


1055 


2405 


A 


8770 


430 


1104 


QQESPAAGAARMNCKfiGTDSSCGCRGNDEK 
KMLKCVWGDGAVGKTCLLMSYANDAPPEE 
YVPTVFDHYAVTVTVGGKQHLLGLYDTAGQ 
EDYNQLRPLSYPNTDVFLICFSWNPASYHNV 
QEEWVPELKDCMPHVPYVLIGTQLDLRDDPK 
TLARLLYMKEKPLTYEHGVKLAKAIGAQCYL 
ECSALTQKGLKAVFDEA1LTIFHPKKKKKRCS 
EGHSCCSII 


1056 


2406 


A 


8773 


261 


332 


NPRIQLSGNSCCAGSCRVWLSEQ 


1057 


2407 


A 


8778 


3 


477 


PAGIRHEQARGADRMGKCRGLRTARXLRSH 
RRDQKWHDKQYKKAHLGTALKANPFGGAS 
HAKGIVLEKVGVEAKQPNSAIRKCVRVQLIK 
NGKKITAFVPNDGCLNFIEENDEVLVAGFGR 
KGHAVGDIPGVRFKVVKVANVSLLALYKGK 
KERPRS 


1058 


2408 


A 


8808 


171 


881 


PGLSQEPSGSMETW1VAIGVLATIFLASFAAL 

VLVCRQRYCRPRDLLQRYDSKPIVDLIGAME 

TQSEPSELELDDWTTNPHIEAILENEDWIEDA 

SGLMSHCIAILIGCHTLTEKLVAMTMGSGAK 

MKTSASVSDnVVAKRISPRVDDWKSMYPPL 

DPKLLDARTTALLLSVSHLVLVTRNACHLTG 

GLDWLDQSLSAAEEHLEVLREAALASEPDKG 

LPGPEGFLQEQSAI 


1059 


2409 


A 


8809 


246 


757 


MRLQGAIFVLLPHLGPILVWLFTRDHMSGWC 

EGPRMLSWCPFYXVLLLVQTAIYSWGYASY 

LVWKDLGGGLGWPLALPLGLYAVQLTISWT 

VLVLFFTVHNPGL ALLHLLLL YGL V V STALI 

WHPINKXAALLLLPYLAWLTVTSALTYHLWR 

DSLCPVHQPQPTEKSD 


1060 


2410 


A 


8810 


304 


381 


PKLSVYPLXJSHHCLSEPFQSLVCCLA 


1061 


2411 


A 


8820 


1673 


848 


SCKTENLLEMWWFQQGLSFLPSALVIWTSAA 

FIFSYrTAVTLHHIDPALPYISDTGrVAPEKCLF 

GAMLNIAAVLCIATIYVRYKQVHALSPEENVI 

IKLNKAGLVLGILSCLGLSIVANFQKTTLFAA 

HVSGAVLTFGMGSLYMFVQTILSYQMQPKIH 

GKQVFWIRLLLVIWCGVSALSMLTCSSVLHS 

GNFGTDLEQKJLHWNPEDKGYVLHMITTAAE 

WSMSFSFFGFFLTYIRDFQKISLRVEANLHGL 

TLYDTAPCP1NNERTRLLSRDI 


1062 


2412 


A 


8824 


1 


763 


GGAPPASVPARESPVSGAQGSSRTRGHKRAA 

GARAPQLCSSWQRRSAPAMSRGLQLLLLSCA 

YSLAPATPEVKVACSEDVDLPCTAPWDPQVP 

YTVSWVKLLEGGEERMETPQEDHLRGQHYH 

QKGQNGSFDAPNERPYSLKJRNTTSCNSGTYR 

CTLQDPDGQRNLSGKVILRVTGCPAQRKJBET 

FKKYRAEI VLLLAL VIF YLTLJQFTCKFARLQ SI 

FPDFSKAGMERAFLPVTSPNKHLGLVTPHKT 

ELV 


1063 


2413 


A


8826 


147 


627 


CETSTSSAGHAPCRHAAQGPPAEPTGLRLCSE 

HQRLHAWPFGPRRPSLWPPKNGKWHSGKRT 

AGGRPQRRPSRRQSQRPSAWSGSPRMHSPGQ 

KCSLMCPHRSQDSLSTAIFQRSPGANTGRALH 

CVLSKEMKSVQRSLGLSRIHLQSKRKDHFVL 

TR 


1064 


2414 


A 


8835 


2982 


1869 



LKDTLKJSQMTQEASDEAEDMKEAMNRMIDE 
LNKQVSELSQLYKEAQAELEDYRKRKSLEDV 
TAEYMCAEHEKLMQLTNVSRAKAEDALSE 
MKSQYSKVLNELTQLKQLVDAQKENSVSTTE 
HLQ VITTLRTAAKEMEEK1 SNLKEHLASKE VE 
VAKLEKQLLEEKAAMTDAMVPRSSYEKLQS 
SI^BVSVIJVSKLKESVK^KEKVHSEVVQIRS 
EVSQVKREKEN1QIUKSKKQEVNELLQ1CFQ 
QAQEELAEMKRYSESSSKLEEDKDKKINEMS 

KFVTTT TCVA1 XTCT COr QVCTCCWI>nen/M r a 
rkEr V 1 AJLfwC/UdN dLrOy LiO I J J doolvKy av^LBA 

LQQQVKQLQNQLAECKKQHQEVISVYRMHL 
LYAVQGQMDEDVQKVLKQILTMCKNQSQK 


1065 


2415 


A 


8841 


3 


663 


AAAtAASLSPRO0RLRTl>!!SDV6PSRAPl>PSA 

APLPTGRAQMSPSGRLCLLTIVGLILPTRGQTL 

KDTTSS SSADATIMDIQVPTRAPD AVYTELQP 

TSPTPTWPADETPQPQTQTQQLEGTDGPLVT 

DPETHKSTKAAHPTDDTTTLSERPSPSTDVQT 

DPQTLKPSGFHEDDPFFTOEHTLRKRGLLVA 

AVLFITGIIILTSGKCRQLSRLCRNHCR 


1066


2416 


A 


8853 


3806 


2204 


FVGEQEGGCEAGAGRGAQTYPGEAGERWFG 

RRRRRGRWSRKKMSLKSERROIHVDQSDLL 

CKKGCGYYGNPAWQGFCSKCWREEYHKAR 

QKQIQEDWELAERLQREEEEAFASSQSSQGA 

QSLTFSKFEEKKTNEKTRKVTTVKKFFSASSR 

VGSKKEIQEAKAPSPSINRQTSIETDRVSKEFIE 

FLKTFHKTGQEIYKQTKLFLEGMHYKRDLSIE 

EQSECAQDFYHNVAERMQTRGKVPPERVEKI 

MDQlbh. l IMTRLYkY WCPETTDDEKKDLAI 

yiSJuKALKW V 1 r QMLLVPVNEDIPEVSDMVV 

KITKNEPASADDFLPTLIYIVLKGNPPRLQSNI 
QYITRFCNPSRLMTGEDGYYFrNLCCAVAFIE 
KLDAQSLNLSQEDFDRYMSGQTSPRKQEAES 
WSPDACLOVKQMYKNLDLLSQLNERQERIM 
NEAKKLEKDLIDWTDGIAREVQDIVEKYPLEI 
KPPNQPLAAEDSENVENDKLPPPLQPQVYAG 


1067 


2417 


A 


8855 


1372 


1513 


SNMREVGCGWLVPVIPAFWEAEVGGSLEARS 
LRQAWATKQDP1SKKK 


1068 


2418 


A 


8856 


1530 


1583 


PCRPGMECNSM1SVHCNL 


1069 


2419 


A 


8857 


1530 


1583 


PCRPGMECNSMISVHCNL 


1070 


2420 


A 


8866 


293 


1675 


PYPQGGYPQGPYPQEGYPQGPYPQGGYPQGP 
YPQSPFPPNPYGQPQVFPGQDPDSPQHGNYQ 
EEGPPSYYDNQDFPATNWDDKSIRQAFIRKVF 
LVLTLQLSVTLSTVSVFTFVAEVKGFVRENV 
ty x i x vox AYrrioirfi YL>oCvwlyrlvKftJir 
VALSVLTASLSYMVGMIASFYNTEAVIMAVG 
ITTA VCFTVVIFSMQTR YDFTS CMG VLL VSM 
WLFIFAILCIFIRNRJLEIVYASLGALLFTCFLA 

FLYHTnGRAKE*PSSSSLCPLRWHGWPGPCP 

WHGSASCTSPLSCPQAQPREKDASLQPSCMY 

TADTSIWTRCGHSMAPLVLPPPPRGTKAIFPC 

HLLSTHCCMSPVCQPTPGTGGSTRSRGEGLSQ 

EVRVHVFPPVPAPQPG\TEHPSPPPHPPGVLPS 

GDMRSGGLIPVLSPE 


1071 


2421 


A 


8868 


2 


358 


ARGNTLYHLPRLCRKLNLRWFSASTLYDVQH 
DDKMGSNTFFKRNDCRYVMISCKADMAYDN 
VRHPFMI* S]\KLIMEETYLN]IKAVYDRPTASII 
LNGEKLKVFPVRSGT*QGCSVWP 


1072 


2422 


A 


8870 


33 


658 


MESVLSKYEDQITIFTDYLEEYPDTDELVVVIL 
GKQHLLKTCKSKIXSDISVUU.WFTYRRKFSPI 
GGTGPSSDAGWGCMLRCGQMMLAQALICRH 
LGRDWSWEKQKEQPKEYQRILQCFLDRKDC 
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CYSIHQMAQMGVGEGKS1GEWVLGPNTAAAQ 
GV*KNLA\LFDEW\NSLGLVYVSM\DNPSGSIA 
RFPKKLCRVLPLVSADTAGLTGP 


1073 


2423 


A 


8879 


146


412 


DFS V* GD VDIEVTCPICLQLLTEPLSLNCGLRL 
*QVCITA*IKESVnSGG*SSSPVCHTTFQPANL 
RTSRYLPT* SIKSLGPDEPQEG 


1074 


2424 


A 


8884 


67 


435 


HLQGRSIRTLQLTGlE^EKMCEVSERjRkSGPW 
KEISFGDYICHTFQGDCWADRSPLHEAAAHG 
RLLALKTLIAQGVNVNLWTL/DRVSSLHEACL 
*GPVACAKPYWKMVPRHGGTVTGPPLLMV


1075 


2425 


A 


8896 


1294 


248 


RSGDRNGLTHQLGGLSQGSRNQSYRSRSRSR 

SRERPSAPRGIPFASASSSVYYGSYSRPYGSDK 

PWPSLLDKEREESLRQKRLSERERIGELGAPE 

VWGLSPKNPEPDSDEHTPVEDEEPKKSTTSAS 

TSEEEKKKKSSRSKERSKKRRKKKSSKKKHK 

KYSEDSDSDSDSETDSSDEDNKRRAKKAKKK 

EKKXKHRSKKYKKKRSKKSRKESSDSSSKES 

QEEFLENPWKDRTKAEEPSDLIGPEAPKTLTS 

QDDKPLNYGHALLPGEGAAMAEYVKAGKRI 

PRRGEIGLTR*RNCHHLNAQVM* * WSRHRR 

MEAVRTAKREPESTVLNIRREPLHPFNPRRET 

KERE 


1076 


2426 


A 


8899 


146 


789 


GRSTEAEKEPAFDERTGKGRRLPRAGEFHG* E 

*APGPGPRSFQVSRKMPEEVPPGARKHPFSGKS

FYLDLPAGKNLQFLTGAIQQLGGVIEGFLSKE 

VSYIVSSRREVKAESSGKSHRGCPSPSPSEVR 

VETSAMVDPKGSHPRPSRKPVDSVPLSRGKE 

LLQKAIRNQK* *CTVQQLSHCRLY\GEKTTAK 

RSQREHVQQQSQEHGKWPDLKGPR 


1077 


2427 


A 


8901 


352 


3 


AKIGAYKYIQELWRKKQSDVMHFLLRVRCW 
QYPALHRAGTEWQLSALHRAPRSTQPDKAC 
RLGYKAKQGYHYRICVRRGGWKCPVPKAVT 
\YGKPVHHGVN*LKFAQSLQSVAEEQ 


1078 


2428 


A 


8905 


536 


781 


ACPAENREVPEMAAGQAPHAGPGAGPGQPA 
PALPFAATPGSRGQALCRGGRRRQHLHGPLH 
RP * QAAPALHAGCQL APHPPT 


1079 


2429 


A 


8912 


121 


376


NLIWKLCVTERRLVlLDNYDLASEA r EANKYI 
CNRHQFKPGQDKYFTLGLPTGSTPL*CYPKLI 
EYNKNGHLSFKYVKTFSMDEY 


1080 


2430 


A 


8920 


381 


1788 


SSESPSDPGRMAMTWIVFSLWPLTVFMGHIG 

GHSLF SCEPITLRMCQDLPYNTTFMPNLLNHY 

DQQTAALAMEPFHPMVNLDCSRDFRPFLCAL 

YAPICMEYGRVTLPCRRLCQRAYSECSKLME 

MFGVPWPEDMECSRFPDCDEPYPRLVDLNLA 

GEPTEGAPVAVQRDYGFWCPRELKIDPDLGY 

SFLHVRDCSPPOPNMYFRREELSFARYFIGL1S 

nCLSATLFTFVTFLIDVTRFRYPERPIKCYAV 

WHMMVSL1FF\IGFLLEDRVACNA\SIPAQYKA 

STWQGSHNKACTMLFNflLYFFTMAGSVWW 

VlLTrrWFIAAWKWGSEAIEKKALLFHASA 

WGIPGTLTIILLAMNKIEGDNISGVCFVGLYD 

VDALRYFVLAPLCLYWVGVSL1XAGIISLNR 

VRIEIPL*KENQDKLVKFMIRIGVFSILYLVPLL 

WIGCYFYEQAYRGIWETTWIQERC 


1081 


2431 


A 


8922 


56 


420 


EERTKMSTGPDYKATVGDISSDGNLNVAQEE 
CSRKGIVDEFFPLLSN*CIWTQPQGYPQSSYG 
TLANFVFVCSVRHGLALILQLCNFSIYTQQMN 
LSIA1PAMVNNTAPPS QPNASTERPST 


1082 


2432 


A 


8923 


355 


1079 


PFGTPSSTMAVVKNKCLMKGGKKGVKKKVV 
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GxTSrJCDQYD^J>^\IK^TuNTGK>TLVART 

QGTQIASDGLKGLLFEVSLADLQNDEVAFRK 

FKLITEDVQDKNCLTNFYGMDLTCDKICSMV 

EKWSTMIEAHVD VKTTDG YFFHLFCV GFTKK 

HNNQILKTSYA*HQQS/RQIQKKMMEIMT*EV 

QTNDUCEVVNKLIPDN1GKDTEKV/CPIYPLH 

DVFIRKVKMLENPGFER\MELRGGGSSS 


1083 


2433 


A 


8948 


28 


385 


LTWPQPHIPSCPAMSEETLQSKLAAAKKKLP
WGAVQGSRAMSDLLLLLLDLTLLLLLMLXGF 
AGYSGQLAGVAVSAGSPPI/RYKFHVEPYGET 
GWLLT/ESCSISPKLCSIAVH*DNPAWF 


1084 


2434 


A 


8950 


156 


318 


HYTPINTDTIENSENNKC W 4 GY*E\VGLIHHW 
WGGKRVQPFWKRVWQKRTLNLRV 


1085 


2435 


A 


8956 


16 


413 


HMGQLGYFIQCWWECKRLISRWKTl*QSPAK 

*TIYTSYDTAIPIS/GI/YPKRMSSKCHQETCAR 

MFILAPFTATIKGKQLTCPLVEERIDY\MWYS 

HKYm\'TCRM,*YTlTH\TWVNLNILMFEIILW 

YSHKYY 


1086 


2436 


A 


8962 


868 


1026 


H*KILQVGRAQRAHXSRL*SQLLRRLRHESHL 
NPGARGCSEARLHRCTPAWTT 1 


1087 


2437 


A 


8985 


58 


330 


LHVKHLGHFQLVFSEYICHCILMPVS*ELQRL 
*ERSVCAFHVCIQTYVCLQVYACMCVYYICM 
FVYSVYGCGLCTCVCMDVYTCVCVQEFl. 


1088 


2438 


A 


8989 


394 


404 


N*KWLLHVNVRLQSIFF/IONQK/INSHELKLD
KKFLDMMSNA* STKKHDKLD/LIKFKT/LCS A
KYTVKIUKIHPTDLEKMLRNHLSDKD*YS/GV
YKDLSKLNRRKTE/S*/VKKWVBCDLSRYFIKE
VISMENKHKKIFSTS


1089 


2439 


A 


8991 


60 


329 


MALTPESFSSFPGLAATGSSVPEPPGGPNATL 
NSSWDSPTEPSSLEDLEATGTIGTLLSDMGVV 
GVEDNAYTLEVNSRYMRAVGIM*IHL 


1090 


2440 


A 


8996 


2 


351 


SNITITLT*MKKYDNTFC W* GCGQIG/T/LIYC
WQESKFIQ AFWSKIQQYLA* ISHDLFDPAFLFL 
GGYPGGTQSVFLTGVLVSSVFYNMKMLHTR 
LLIAALFUVQYWKQSKDHYI 


1091 


2441 


A 


8997 


97 


456 


YPLPVCSYLSGPRGEHWNSLGGKSSCPLPLPT 
LVSSRFKISKVIWGDLSVGKTCLINR*GGAG 
AELGRVGPSLARWAGSRSQHLVPSQWCKDS 
FDKNYKAPIGADFEMERFEVLGIPF 


1092 


2442 


A 


8999 


548 


811 


SSFIKRHILIFEDDWHQTTCCHHPHHP\F*RCQ 
FHIFYVSVQNSISPSLSVSSSHPDRPDHEVHQH 
RAAHHHQHGQGPLGHGLVARVG 


1093 


2443 


A 


9002 


3 

1 


2745 


ALLGLQQPAQSLILSRSSVMGVRGLQGFVGS 

TCPHICTVVNFKELAEHHRSKYPGCrTFTIWD 

AMCCLRYWYTPESWICGGQWREYFSALRDF 

VKTFTAAG1KLIFFFDGMVEQDKRDEWVKRR 

LKNNREISRIFHYIKSHKEQPGRNMFFDPSGLA 

VFTRFALKTLG QETLCSLQEADYEVASYGLQ 

HNCLGILGEDTDYLIYDTCPYFSISELCLESLD 

TVMLCREKLCESLGLCVADLPLLACLLGNDII 

PEGMFESFRYKCLSSYTSVKENFDKKGN11LA 

VSDHISKVLYLYQGEKKLEEILPL/VTKQSSFL 

+RNGIISFTRT/INLHGFSKNPKV* +LWTNK* YP 

RVQTPNTCKKFPCVQMUsTGKKFPCVQALNP 

GEKFPCIHI/PEPRQEVPTCSDPEPRQEVPTCTG 

PESRREVPMCSDPEPRQEVPMCTGPEPRQEVP 

MCTGPEARQEVPMCTDSEPRQEVPMCTDSEP 

RQEVPMYTGSEPRQEVPMYTGPESRQEVPMY 

TGPESRQEVLIRTDPESRQE1MCTGHESKQEV 

PICTDPISKQEDSMCTHAEINQKLPVATDFEFK 

LEALMCTNPEIKQEDPTNVGPEVKQQVTMVS 

DTEILKVAR.THHVQAJSSYLVYN1MSSGEIECS 

NTLEDELDQALPSQAFIYRPIRQRVYSLLLED 

CQDVTSTCLAVKEWFVYPGNPLRHPDLVRPL 

QMTIPGGTPSLKILWLNQEPEIQVRRLDTLLA 

CFNLSSSREELQAVESPFQALCCLLIYLFVQV 

DTLCLEDLIIAFIAQALCLQCKSTSQLVNLQP 

DY1NPRAVQLGSLLVRGLTTLVLVNSACGFP 

WKTSDFMPWNVFDGKLFHQKYLQSEKGYA 

VEVL/CRTKnSAHQIPQPEGSRLQGLHEGEQT 

HHWPSPLGLTPRREVGKTGLQLPQDGLWV 


1094 


2444 


A 


9021 


97 


834 


AREACRAKTDFPGRRFRLWPSCCCRVIVGAE 

T*H\MAEPVSPLKHFVLAKBCAITAIFDQLLEFV 

TEGSHFVEATYKNPELDRIATEDDLVEMQGY 

KDKLSHGEVLSRRHMKVAFFGRTSSGKSSVI 

NAMLWDKVLPSGIGHITNCFLSVEGTDGDKA 

YLMTEGSDEKKSVKTVNQLAHALHMDKDLK 

AGCLVRVFWPKAKCALLRDDLVLVDGPGTD 

VTTELDSWIDKFCTKSSTREITNSGSDT 


1095 


2445 


A 


9022 


1 


537 


LVLNSRVEDFVPPEGAGRTLPFALRPLAACW 
LLHRRARRSSALCPRPRSWGVSGGEGAGARE 
P* ITSS SCCLSAA/SHLSIQSPNMAGARRRIRPQ 
LAKEKIEGCHICTSVTPGEPQVFLGKDKAFTF 
DYVFDIDSQQEQIYIQCIEKLIEGCFEGYNATV 
FAYGQTVGAGKTYTMGTGFD 


1096 


2446 


A 


9029 


1 


285 


FFFFNVCKSPKVPKPGCKEESTGTLFKNTLISL 
GQHSETPSLKKK\L AGYSGMCL* SQ VLRRLRQ 
EDCLSPGGGNCRES* SCPYTP AWTTERDP V 


1097 


2447 


A 


9032 


716 


357 


ARSTGFWGEILWCGFLKRSLALSPRVKCSGAI 
LAHCNFRHAGFPPLSCLSLPNRWEYRRPPARP 
GKFFLVFLVETGFQC/G*DGLDLLTSRSACLG 
LPKC WD YRREP AAS IIF QTTFFIN S K 


1098 


2448 


A 


9038 


230 


652 


KVWMSCEDINISGSFYRNKLKYLAFLCKRTS 
TNPSQGPYHLWVPSH1FWQTTCGRLPHKTKQ 
G*AALDHLKVFDRIPLPYDKKKQMAVSATLE 
WRPKP* RKFAYLGHWAQKVDWKYQAMT A 
TMGEKRKVYYQKICYQKX 


1099 


2449 


A 


9043 


185 


372 


I1FYSHQQCMRV/WQGCGDIETLIHCW*E*KII 
HSL/WK/TV*QFLKRLYLHLPHNSVIAFLGISP 
RKIKTCPQNSCTSMLINAIHNDQKWKKINI 


1100 


2450 


A 


9045 


763 


584 


RQSLALSPRLECSGTISAHCRLCPLVFTPLSCL 
SLTSSWDYRRPPPHPANFLYFK*RRGF 


1101 


2451 


A 


9050 


275 


2 


tFFLkKVSNQFLSP$LLPV}*Q6FVFAFLLLLL 
FLL/FEMESLP VA/RVECSGTIS AHCNLCLPGS S 
DSPASAS*VAGITDMCRYTQLILFHAS 


1102 


2452 


A 


9053 


449 


1224 


KTSMFWKFDLHSSSH1DTLLEREDVTLKELM 

DEEDVLQECKAQNRKUEFLLKAECLEDLVSF 

I\* EEPPQDMDEKIRYKYPNISCELLTSDVSQM 

NDRLGEDESLLMKLYSFLLNDSPLNPLLASFF 

SKVI^n.ISRKPEQIVDH.KKJKHDFVDLIIKHIG 

TSAIMDLLLRLLTCIEPPQPRQDVLN/WFKVQ 

RNL*HST*NVMDISKYVNLHWGLNKSHSLL* 

LLLQCVLQWLNEEKIIQRLVEIVHPSQEEDVS 

SLV 


1103 


2453 


A 


9058 


403 


3 


GLHVYDFQVYREHILTLNVKKCSVSFWGLRE 
WLYLQMYEDDKSPRFPIIKMTDITKC W* GCVGA 
AGMQI/H/CW\WCVNVGKFWEMS*YYLLKLSI 
ST/PYDPAlPIXGrYL»ETOVYlHPKTCMRMLIA 
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APFVLAVNC 1 


1104


2454 


A 


9064 


f j 


37 J 


KWU?SSLMTGRGDIIGHLKWLDCK\NCSSFPI 
KRNRQTHSTESNKLKAOHSFOYN* LIH+NSW 

KTDCGCGANSKGWWMKV\KTAQQKQTTS 
YMQIGTTKNSRAT 


1105 


2455 


A 


9065 


366 


778 


DIXJLRNLAfPELKRRNCISRFYLAYHIiiKIYS 
RSILLCNNCSGFYILSL*QYDWFFrWFFRDR 
AWPCCPGWSAAWLTIVILAHYRRPGLERSCC 
LSLSSSWDHRRVPPCPANF+/YFSMGFTAFPRL 
VLNS'TQGI 


1106 


2456 


A 


9083 


673 


816 


ESGSL1H*WWENKPAQPLWWEI*QHVQKLPT 
HFPCDPAIPLLGICPED


1107 


2457 


A 


9086 


580 


18 


KPSSGSFIRAIYIFLSTAHVPALFSVLVR1KLT* 
AFSQSSVLWAHKQQKTSLSLVIR/ERLQIKTA 
VRENFLPIRLAKILKLDNVKCWQG/SGSNMSL 
I/HCWWEYNVIHnWNSVTFPRKVEHVYITYA 
PEISVR*IHGGLPTLVHQETHTSVFRGAPSVIP 
ETOCmKESINKLLHIYTMEHYGDENK 


1108 


2458 


A 


9093 


540 




GGNDGSVTPri'EPURKElT*KRKF*EKTDRLP 

GA/PPSRTPPTPYPCPHGDRLLPPSRPLPAGPA 

SAFPPAERSRGHRRASL*RARWSAAVPRRSA 

GSASEPVQSRWLRLPVGSDSPPAVPVRVCPAP 

DSRPAAPGSRLPDPGLDSPAPSRTPSSSVD*OG 

QRPPPPSGDSLSPPGCCRY 


1109 


2459 


A 


9099 


1255 


1425 


HE5YHVNPNLCNPVAPTSGAHSIG*KWPSWL 
GAVAHSCNPSTLVGRGGRITRGQELR 


1110 


2460 


A 


9103 


242 


70 


EEQFFFFAVGMFP*VDFLAPASGELWDRLRLT 
CSRPFTRHQSFGLAFLRVCSSLDSLDDSWGP 
SALLSSVL/NQGGRNVLEAREAAKHPTI*RQS 
LLRKQRNKRMAIP 


1111


2461 


A 


9110 


189 


121 


SFLSVRLECNGAIMAHCALPLPG 


1112 


2462 


A 


9113 


100 


910 


RRRGGGSRPRRTPVPAPGPGPSFGMDVRFYP 

AAAGDPASLDFAQCLGYYGYSKFGNNKNYM 

NMAEANNAFFAASEQTFHTPSLGDEEFEIPPIT 

PPPESDPALGMPDVLLPFQALSDPLPSQGSEFT 

PQFPPQSLDLPSITISRKLVEQDGVLHSSGLHM 

DQSHTQVSQYRQDPSLIMR\PSST*PDAARSG 

VMPPAQLTTINQSQLSAQLGLNLGGASMPHT 

SPSPPASKSATPSPSSSINEEDADEANRAIGEK 

RAAPDSGKKPKTPKK 


1113 


2463 


A 


9120 


3452 


3051 


FLRPSFALVPQAOVQWCALSWLQPPSPRFK*F 

SCLSLPSSWDYRHVPPRPANFFVLLVETGFLH 

VGQAGHEPLTSGDPPASASQSAGITGVSHQA 

WPSFFIFSRDTVLLCCSGWSRTSGLKQSACLS 

LLKCWDY 


1114 


2464 


A 


9122 


152 


377 


NQLPLQQWTFFrYETGFCSVAQAGVQCRDHS

SUff*PPG\SSDPPAPPS*VLGITGQRYHACLII 

YLYVQTVPQRV 


1115 


2465 


A 


9124 


553 


981 


ORPLLROOLGSWPTCRSLEGDI ASPW**RT T>n 

SPRMRRSGT/ATLNLPLSPQGTVRTAVEFQVM 

TQTQSLSFLLGSSASLDCGFSMAPGLDLISVE 

WRLQHKGRGRGDLHLPDHHLSVPSSADHPA 

QQPSQFNGRNLYFLPLFR 


1116 


2466 


A 


9135 


48


410 


SASHEPAEHDGGADSLSA5QPPRPAORPAGA 

QHVHVPPWTDVLAGQDRRAPTAGDGAPWP 

APGGHVPSTRPHDPAEFHADEAAGRGGRGLQ 

PAAPHALPAGLPHGPPAPA/PAEGGGTP*GSA 

GAGGP* GSPAGRACGAAGCRPRPPRPAASS A 

*NSAGS*GLVEGT*PPGAGHGAPSPAVGARLS

/=possible nucleotide deletion, \=possible
nucleotide insertion














CPARTSVQGOTWTC*APAGRPAGLGGWEAli 
RESAPPSCSAGS*DAD*GAEPWGAGSRSWGS 


1117 


2467 


A 


9141 


380 


939 


KSGHWAKECLQPRIPPRPCPICVGPHWKSDCP 

TCPGAVPRAPGTLPQGSLTDSFPDLLSLVAED 

* CCLMASEAS WTTREL WVTLTVEGKS VP/CL 

NTEATHSTLPSFQGPVSLASITWGIDGQASKP 

LKTPQLWCQLGQYSFMHYFLVIPTCPVPLLG* 

GILTKLSAFLTIPRLQPHLIAALSPSS 


1118 


2468 


A 


9154 


471 


2 


AAGQWVEVTSHLYLC1TSDAAGLRLLPPAES 
ERGEGGHCPAEAPLPPRPQYCLAKHPLLRKLP 
EEKIKLDPYLTQHTKINSKQIKYLS/VRAKTTQ 
LVEGNIG VNLQNTELKQH* INGFLDTTPE AQE 
TKEKTNKLNFIKKVKRQLAEWEKIFQIA 


1119 


2469 


A 


9155 


2 


3187 


ACPRLARRRRRVRSUIRRRGWLRARWSRGQ 

NNMAARRTTQETFDAVLQEKAKRYHMDASG 

EAVSETLQFKAQDLLRAVPRSRAEMYDDVHS 

DGRYSLSGSVAHSRDAGRESLRSDVFSGPSFR 

SSNPSISDDSYFRKECGRDLEFSHSNSRDQVIG 

HRKLGHFRSQDWKFALRGSWEQDFGHPVSQ 

ESSWSQEYSFGPSAVLGDFGSSRLIEKECLEK 

ESRDYDVDHPGEADSV/LRGGSQVQARGRAL 

NI VDQEGSLLGKGETQ GIXTAKGGVGKLVTL 

RNVSTKKIPTVNRTTPKTQGTNQIQKNTPSPD 

VTLGTNPGTEDIQFPIQKIPLGLDLKNLRJLPRR 

KMSFDnDKSDVFSRFGIEIIKWAGFHTIKDDlK 

FSQLFQTLFELETETCAKMLASFKCSLKPEHR 

DFCFFTIKFLKHSALKTPRVDNEFLNMLLDKG 

AVKTKNCFFEIIKPFDKYLMRLQDRLLKSVTP 

LLMACNA YEL S VKMKT1JSNPLDLALALETTN 

SLCRKSLALLGQTFSLASSFRQEKIL*AVGLQ 

DIAPSPAAFPNFEDSTLFGREYIDHLKAWLVS 

SGCPLQVKKAEPEPMREEEKMIPPTKPEIQAK 

APSSLSDAVPQRADHRWGTIDQLVKRVIEGS 

LSPKERI1XKEDPAY WFLSDENSLEYKYYKL 

KLAEMQRMSENLRGADQKPTSADCAVRAML 

YSRAVRNLKKKLLP\WQRRGLLRAQG\LRG\ 

WKARRANTTGTQTLLFLRAPGLKHHGRQAPG 

LSQAKPSLPDRNDAAKDCPPDPVGPSPQDPSL 

EASGPSPKPAGVDISEAPQTSSPCPSADIDMKT 

METAEKLARFVAQVGPEIEQFSIENSTDNPDL 

WFLHDQNSSAFKFYRKKVFELCPS1CFTSSPH 

NLHTGGGDTTGSQESPVDLMEGEAEFEDEPP 

PREAELESPBVMPEEEDEDDEDGGEEAPAPG 

GAGKSEGSTPADGLPGEAAEDDLAGAPALSQ 

ASSGTCFPRKRISSKSLKVGMIPAPKRVCLIQE 

PKGECPPVGTVASSTVLGWWAVRVRRDRWR 

HFNPKEFCAPLQNVSRHSCFPVV 


1120 


2470 


A 


9163 


124 


207 


PPRACRPCPRACPCPPT'KCSQPVSWPC 


1121


2471 


A 


9166 


272 


523 


PMSSLQGCFYTFKCIIFKGIFLLLISNLIAF**EK 
V/CSHTTDSLKFIGKGWVGMVIHACNPGTLG 
G* GG WIA* VREFETSLGNM 


1122 


2472 


C 


9170 


442 


236 


MNRRRFLRPADCHSGMRGTENGACSEGESQI 
HCGAGGEGVQLVHWNQPENGCLQFDSTHIT 
FSKRQN* 


1123 


2473 


A 


9171 


10 


423 


MVDRSPLLTSV11FYLAJGAA1FEVLEEPHWKE 
AKKNYYTQKIJILUtEFPCLGQEGLDKILEVV 
SDAAGQGVAITGNQTFNNWNWPNAMIFAAT 
VnTIGYGNVASKTPGGRLFCGFYGLFGVPFC 
LTWINALGKFFG


Amino acid sequence (A=Alanine, C=Cysteine,
D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion


1124


2474


A 


9173 


3 


374 


GPSPSU..VlJJpQEPGGTCr[?VBAG\AG^A.GMWL 
WED(^CU,LGPFSFUvlLMLlXETRr^VNACLL 
TGSLFVLLOVFSFEPVPSCRALQELKPRDRISA 
IAHRGGRHDPPENTLGAIRyQGS* ♦ WSNRR 




2475


A 


9179 


704 


188 


ESSSGLLFQCFQGIHVQKLTLQARPTLFSWWL 

CSKPPKETGELENAESGGDGGRRGGKQDNV 

AWWRRM\QKG\DFPWDDEDFPQSGPFGGQA 

LPMGFFYLYFRDPGREITWKHFVQYYLARGL 

VDRLE WNKQS VR VI PAPGTS SE VRGEFKAE 

YCRHKFISCKNWFYFFQ 


1126 


2476 


A 


9183 


153 


233 


MEYMAESTDRSPGHILCCECGVPISPN 


1127 


2477 


A 


9185 


1 


321 


LTGQLGS1LLRVFSKSRAGLGARKLKAYRTM 
EYMAESTDRSPGHILCCECGVPISPNPAQY\CV 
ACLRSSFHIYHCIPKLFIHPFSKTSSSAFmPSHY 
LTFFSTIS 


1128 


2478 


A 


9186 


183 


847 


VLKFLLLQTMDEQSQGMQGPPVPQFQPQKAL 

RPDMGYNTLANFRIEKKIGRGQVFSEVYRAAC 

L\LDGVPVALKKVQIFDLMDAKARADCIKEID 

LLKQLNHPNVIKYYASFIEDNELNIVLELADA 

GDLSRNOKHFKKQKRLIPERTVWKYFVQLCS 

ALEHMHSRRVMHRDIKPANVFITATGVVKLG 

DLGLGRFFS SKTTAAHSLVGTPYYMSPERIHD 

NG 


1129 


2479 


A 


9190 


1 


370 


GTSWKIPSAAVSESSPNGAAYASGLPCGVRG 
PPWAGLALLPSPTLMALLRRPTVSSDLDNIDT 
RATTAKIRVVATITRARIEDMRHSATALTRPD 
ATTAQ1PKLPVTTVCNRRANPGIPPSVL 


1130 


2480 


A 


9194 


131 


487 


AYLKRLPVPESITGFARLTVSEWLRLLPFLGV 
LALLGYLAVRPFLPKKKQQKDSLINLKIQKEN 
PKWNEINIEDLCLTKAAYCRCWRSKTFPAC 
DGSHNKHNELTGDNVGPLILKKJKE 


1131 


2481 


A 


9201 


184 


605 


KELVDEKSERGRAMDPVSQLASAGTFRVLKE 
PLAFLRALELLFArFAFATCGGYSGGLRLSVD 
CVNKTESNLSIDIAFAYPFRLHQVTFEG\PTCE 
GKERHKLALIGDSSSSAEFFGTVAGFAFLYSL 
AATGVY1FFQNKY 


1132 


2482 


A 


9206 


1 


852 


GGGRAGAGSRDMGSTDSKLNFRKAV1QLTTK 

TQPVEATDDAFWDQFWADTATSVQDVFALV 

PAAEIRAVREESPSNLATLCYKAVEKLVQGA 

ESGCHSEKEKQIVLNCSRLLTRVLPYIFEDPD 

WRGFFWSTVPGAGRGGQGEEDDEHARPLAE 

SLLLAIADLLFCPDFTVQSHRRSTVDSAEDVH 

SU)SCEYTV^AGVGFAHSPQPbTYIHDMNRME 

LLKLLLTCFSEAMYLPPAPESWQH/R1HWFSS 

FVSSENRHALPLFTSLLNTVCAYDPVEYGIPY 

NHLY 


1133 


2483 


A 


9208 


1165 


1463 


GPRARVQGFSGADIVKFMALGSMYLVLTLIV 
AKVLRGAEPCCGPLKNRVLRPCPLPATLPPP 
HrQF&K^rYGCLPTYKVVYKLLSWPL^ 
NVYFIV 


1134 


2484 


A 


9210 


66 


1586 


MAGAGPKRRALSAPVAEEKEEAREKIMAAK 

RADGAAPAGEGEGVTLQGNI71XKGVAVIW 

AIMGSGIFVTPTGVLKEAGSPGLALWWAAC 

GVFSIVGALCYAELGTTISKSGGDYAYMLDV 

YGSLPAFLKLWIELLIIRPSSQYrVALVFATYL 

LKPLFPTCPVPEEAAKLVACLCVULTAVNC 

YSVKAATRVQDAFAAAKUJVLALIILLGFVQI 

GKGDVSNLDPNFSFEGTKLDVGNIVLALYSG 

LFAYGGWNYLNFVTEEMINPYRNLPLAinSLP
IWLVY VLTNL A YFTTLSTEQML S SE AV A VDF

GNmOVMSWIIPVFVGLSCFOSVNGSLFTSS 

RLFFVGSREGHLPSILSMIHPQLLTPVPSLVFT 

CVMIXFYAFSKX>IFSVn^FFSFFNWLCVALAn 

GMIWLRHRKPELERPIKVNLALPVFFILACLF 

LIAVSFWKTTPWSVASDFniLSGLPVYFFGV 

WWKNKPKWAPPGHLSPRPSCVRSSCMWPQ 


1135 


2485 


A 


9216 


40 


410 


RDRLPPAYFCRPWCWTALDVG\SPESQEM 
DLVAFEDVAVNFTQEEWSLLDPSQKNLYREV 
MQETLRNLASIGEKWKDQNIEDQYKNPRNNL 
RSLLGERVDENTEENHCGETSSQIPDDTLNK 


1136


2486






a 
J 




RRRRRSRYRRCSRFPRWrPLAVSMPHAFKPG 

DLVFAKMKGYPHWPARJDDIADGAYKPPPN 

KYPIFFFGTHETAFLGPKDLFPYDKCKDKYGK 

PNKRKGFNEGL WEIQNNPHAS YS APPPVS SSD 

SEAPEANPADGSDADEDDEGNRGVMAVTAVT 

ATAASDRMESDSDSDKSSDNSGLKRKTPALK 

MSVSKRARKASSDLDQASVSPSEEENSESSSE 

SEKTSDQDFTPEKKAAYRAPRRGPLGGRKKK 

APSASDSDSKADSDGAKPEPVAMARSASSSSS 

SSSSSDSDVSVKKPPRGRKPAEKPLPKPRGRK 

PKPERPPSSSSSD 


1137 


2487 


A 


9229 


21 


239 


LFPRLECRDPVTVNCTLNLPGSKNAPTTASQV 
GSTWNYRGGLPHPTNFFVKTGFRCSQAGLBCL 
RGSREPPAWA 


1138 


2488 


A 


9231 


1664 


2 


TRSVGVNTCEVGWTEPECLGPCEPGTSVNL 

EGIVWHETEEGVLWNVTWRNKTYVGTLLD 

CTKI-IDWAPPRFCESPTSDLEMRGGRGRGKR 

ARSAAAAPGSEASFTESRGLQNKNRGGANGK 

GRRGSLNASGRRTPPNCAAEDIKASPSSTNKR 

KNKPPMELDLNSSSEDNKPGKRVRTNSRSTP 

TTPQGKPETTFLDQGCSSPVLIDCPHPNCNKK 

YKHINGLRYHQAHAHLDPENKLEFEPDSEDK 

ISDCEEGLSNVALECSEPSTSVSAYDQLKAPA 

SPGAGNPPGTPKGKRELMSNGPGSHGAKAGK 

NSGKKKGLNNELNNLPVISNMTAALDSCSAA 

DGSLAAEMPKLEAEGUDKKNLGDKEKGKK 

ANNCKTDKN\PSKLKSARP1APAPAPTPPQLIA 

IPTATFTTTrTGTffGLPSLTTTVVQATPKSPPL 

KJPIQPKPT1MGEPITVNPALVSLKDKKKKEKR 

KLKDKEGKETGSPKMDAKLGKLEDSKGASK 

DLPGHFLKDHLNKNEGLANGLSESQESRMAS 

IKAEADKVYTFTDNAPSPSIGS 


1139 


2489 


A 


9234 


207 


443 


fRRGQPWRiUUAAAGILPGREAAACLPSC/AS 

VTAAVSGLLVGYELGIISGALLQIKTLLALSC 

HEQEMGVSSLVIGALL 


1140 


2490 


A 


9238 


248 


328 


MAQGNNYGQTSNGVADESPNMLVYRKV 


1141 


2491 


A 


9242 


2 


535 


FVEAAVKMLGSLVLRRKALAPRLLLRLLRSP 

TLRGHGGASGRNVTTGSLGEPQWLRVATGG 

RPGTSPALFSGRGAATGGRQGGRFDTKCLAA 

ATWGRXPGPEETLPGQDSWNGVPSRAGLGMX 

WPWAAALVVHCYSKSPSNKDAALLEAARAQ 

VNMQEVSRNRCALLHSAAVQEYGYGN 


1142


2492 


A 


9245 


157 


466 


HLC)F\^FVGU^EQQLMLFATLLRMAQGCD 
FALGNDFLNITTKAQA/TKEKLDKLDFIKIKTC 
CTSMDAIEKTEPLTKWTKAFVSHVS YKRLLF 
GICKEYSRQ 


1143 


2493 


A 


9247 


264


115


GLPQQTSTIQPPGTPDGARDFTSTIQPPGAPDG 
ARDSTSIIRMGPEIPPP


1144


2494 


A 


9260 


1 


4flt 


J^AVrvjKLobMoboLNr iLFAriTTSSPVTVDCGP 
SLGLAAGIPLLVATAIXVALLFTLIHRRRSSE 
AMEESDRPCEISEIDDNPK1SENPRRSPTHEKN 
TMGAQEAHIYVKTVAGSEEPVHDRYRPTIEM 

Em 


1145 


2495 


A 


9264 


175 


411 


METIWIYQFRLIEIGDSTVGKSCLLHRFTQGRF 
PGLRSPACDPTVGVDFFSRLLEIEPGKRIKLLL 
WDTAGQERFISIT 


1146 


2496 


A 


9277 


592 


814 


MFTYLEGREGIKSQPKMEPHSVT\RLECSGMI 
SAHCSLNLPGTSDSPASASR/VAGTTGMRHHA 
WLIFAFLVETGF 


1147 


2497 


A 


9279 


1255 


2 


FRRGRRGEEEKEEEEEEEEGWVNGMENSHPP 

HHHHQQPPPQPGPSGERRNHHWRSYKLMIDP 

ALKKGHHKLYRYDGQHFSLAMSSNRPVErVE 

DPRWGIWTKNKEXELSVPKFKIDEFYVDQV 

PPKQVTFAKLNDNIRENFLRDMCKKYGEVEE 

VEILYNPKTKKHLGIAKV\TATVRGAKDAVQ 

HLIISTSVMGNmiVELDTKGETRMRFmiLV 

TGRYTPQTLPVGELDAVSPIVNETLQLSDALK 

RLKDGGLSAGCGSGSSSYTPNSGGTPFSQDTA 

YSSCRLDTPNSYG/QGTPLTPRLGTPFSQDSSY 

SSRQPTPSYIJSQDPAVTFKARRHESKFTDAY 

NRRHEHHYVHNSPAVTAVAGATAAFRGSSD 

LPFGTVGGTGGSSGPPFKAQPQDSATFAHTPP 

PAQATPAPGFR 


1148 




A 




1026 


6 


IASIQNADTMPGVGLLVSHFSTLVSRQRCPNY 

ADPQNLTDVSIFLLLEVSGDPELQPVLAGLFL 

SMCLVTVLGNLLIILAISPDSHLHTPMYFFFSN 

LSLPDV\GFTSTTVPK\MIVDI\QSRSRV1SYAG 

CLTQKSLFAIFGGTEE\NMLLSVMAYDRFVAI 

CHPLYHSAIMNPCFCAFLVLLSFFFLSLLDSQL 

HSWA^QFTUKNVEJSNFVCDPSQLLKFACSD 

SIINSfflYFHKDPERQLVLAGLFLSMCLVTVL 

GNLIIILDVSPDSHLPTPMYFFLSNLSLPDIGFT 

STTVPKMTVDIQSHGRVIFYAGCLTQMSLFAIF 

(juMJsbKHAPECDGL 


1149 


2499 


A 


9303 


1 


699 


MASQEKDIFIGWGTIHUTIKPQRSFFGKLLRE 
FRLVAADRSMGRYMLFGMNLICTGFLLMWC 

CO" 1 'MOT A I 'IXOVPVT 'I'll: I'M lXir \ AT>m tmniinn 

oMN otAL loYlYJLI lrDLrbLMTCUS YW VTL 
RKPSPVYSFGFERLEVLAVFASTVLAQLGALF 
HKESAERFLEQPEIHTGRLLVGTFVALCFNLF 
TMLSIRNKPFAYVSEAASTSWLQEHVADLSR 
SLCGHPGLSSIFLPRMNPFVLIDLAGAFALCIT 

VMT TUT 


1150 


2500 


A 


9308 


797 


693 


DRSTSVTRAGVQWCSLGSLQPRTPGLLRSSCL 
SLP 


1151 


2501 


A 


9309 


205 


406 


VAHCELPVLWKWSKPTR\TAKEPPQTQQRAG 
SKTAAPPCQWSRMASEGPNIPCPGARHSDKQ 
FLICTI 


1152 


2502 


A 


9314 


$13 


504 


KPSPLITPPAVVLPPSAVLNLVMTFSSFPQVEV 
QGPLCGPRKGRLAVTIPFFGLS/LPKYMDHRR 
PPPHR\EEFFVFLAETGFHRASQAGPDLPTS/S/I 
PPTSA/FPKCWEYRSEPQCLPGCLSFSGILLDL 
GTNVSLRAA 


1153 


2503 


A 


9315 


392 


1 


HPHRPRPGFRSPARSSRPCPVLtSLLPPFPSPSP 
PADDLVKAGRDRKDPQVR/ERRLRPNPGRLG 
GPR\PRPARARS/CHQPRLTRVCPRSPPPEARA 
P APAAPARGRG APKRNRPRTDTRAPRGS SAR 
PGNS


1154 


2504 


A 


9321 


331 


433 


MPCl/QAQYGTPAPSPGPRDHSASDPLTTEFIK 
PT 


1155 


2505 


A 


9324 


180 


275 


MEEPQSDPSVEPPLSQETFSDLWKLLSENNVL 


1156 


2506 


A 


9326 


383 


619 


MISPSRTEGDPLPLPP/EGEGQEVRGFGGGPAK 

EAAQRHCRASVSILRMRRPGQGSSRPARVPL 

RGPDSHRLREPPPSPP 


1157 


2507 


A 


9327 


152 


292 


YERRGRSQGGGSHPAGAQPGORAIGAGWQS 
KEPLWEGLQRSGSPLPG 


1158 


2508 


A 


9328 


1


430 


QELKQGPNPLAPSPSAPSTSAGLGDCNHRVD 

LSKTFSVSSALAMLQERRCLYVVLTDSRCFL 

VCMCFLTFIQALMVSGYLSSVITTIERRYSLKS 

SESGLLVSCFD1GNLVVWFVSYFRGRRRRP/ 

RVAAVGGLLDLEGGEMI 


1159 


2509 


A 


9334 


108 


383 


KGNQVNGNGNQLKRKI^ESMCPVSLTQNTVR 
LMEAGLPQKQAERADELFEAGLVTYVKLDER 
VLNALWSSVGLQWFKESDLSHLRLLEISFR 


1160 


2510 


A 


9338 


2 


430 


FVGRPRGLSDRl^DLFLAGFRVGERLRTAAM 
KRYVRILLLGEGAEHVADPVPGGRGVPRGEA 
DHTDQELREEIHKANVERVVHDVSQEATIEKI 
RTKWIPLV/RWGDHA/EGPVGIKSYLPSGRSM 
EAELPIMSQLTEIETCVEC 


1161 


2511 


A 


9341 


1 


390 


NSRVDDFVAPGLSEAGKLLGLEFPERQRLAA 
AVG/CSPMSGVISMSAPr^GKilDAriTNPTV 
DYSDNLTRLCLGLSGVFLCGAAANAIRVYLM 
QTSRQRWKRLRTSLFSSBLGQEVAFSDKAGT 
GELI 


1162 


2512 


A 


9343 


84 


837 


QGRFRAFCWQRDFLQPPGMRLSALLALASKV 

TLPPHYRYGMSPPGSVADKRKNPPWIRRRPV 

VVEPISDEDWYLFCGDTVEILEGKDAGKQGK 

WQVIRQRNWVWGGLNTHYRYIGKTMDYR 

GTMIP SEAFLLHRQ VKL VDPMDRKPTEIE WR 

FTEAGERVRVSTRSGRIIPKPEFPRADGIVPET 

WIDGPKDTSVEDALERTYVPCLKTLQEEYME 

AMGIKETRVNTRRSIGIEPGAEQLLPNFCPSLE 

G 


1163 


2513 


A 


9346 


967 


616 


DSLALSPRLECSGAISAHCNLTPPGFTPFSCLS 
LPSSWAYRCASPHPDNFFVFLVESGFHHVGQ 
AGLKLLISSDPPTSA/FPKCWDYRRDNSSAPAT 
FSSYQRNNPDLILNDTIMPNIK 


1164 


2514 


A 


9347 


3 


1099 


SSFPTCMRTVFHSNTSVSSLLHRPGHVTPQLTI 

HGGWRHHRDHTAIDEWDFNPSKFLIYTCLLL 

FSVLLPLRLDGIIQWSYWAVFAPIWLWKLLV 

VAGASVGAGVWARNPRYRTEGEACVEFKA 

MLIAVGIHLLLLMFEVLVCDRVERGTHFWLL 

VFMPLFFVSPVSVAACVWGFRHDRSLELEILC 

SVNILQrTFIALKLDRnHWPWLVVFVPLWILM 

SFLCLVVLYYIVWSLLFLRSLDVVAEQRRTH 

VTMAISW1TIVVPLLTFEVLLVHRLDGHNTFS 

YVSIFVPLWLSLLTLMATTFRRKGGNH W WF 

AIRRDF/CQDQLPQPTGKPPPPPLTDHHGEKA 

LPLQNKDRGS WP ASRGSPRLL 


1165 


2515 


A 


9362 


547 


991 


DVSIGPPLLRRPCSGREQTRSLSFPSDPESSFSP 
VPEGVRLADGPGHCKGRVEVKHQNQWYTV 
CQTGWSLRAAKWCRQLRCGRAVLTVQKRC 
TKHAYGRKPIWLSQMACSGPEPTLHDCPFRP 
LGEDTLFHYEYTSVHGRERLSAKD 


1166 


2516 


A 


9363 


201 


387 


PPOJIWTPPSGKNFFFKPFFESEFY/SSPRVECS 
GAISAHLAHCNLCLPGSSDSPASAFQVAS 


1167 


2517 


A 


9368 


707 


1087 


AVLTPCLSPCSPSRIPRPVSRPYPGRRSLSHTPP


Amino acid sequence (A=Alanine, C=Cysteine,
D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion














PRPLILYA^AP\RPAGXAFIPBSHPPPPDU,PPT 
ATPAjTTCPSLPPPPmHPTQPSmLPDPPPW 
PLPFPPPSS/RPPRPDCSTSYSPTFPPPT 


1168 


2518 


A 


9375 


511 


15 


MMLSEETSAVRPQKQTRFNGAKLVWMLKGS 

PITVTSAVnVIJvaLMM/IFSPWLATHDPNAID 

LTARLLPPSAAHWFGTDEVGRDLFSRVLVGS 

QQSILAGLVWATTGMIGSPLECLFGELGGRA 

DAIFMRVMDIMRS/IPSLVLTMEKTAALGPSL 

FNAMQASSEH 


1169 


2519 


A 


9377 


42 


410 


GNGRVAPRDPGAVASAEPGUTHDSGVNPN 
NSARRMEAMASGSNWLSGVNWLVMAYWS 
LVFVLLFIFAKRQIMRFAMKSLRGPHGPVGH 
NAPKDLKEEIDILLSRVHNIKYEP\HLLADDDA 


1170 


2520 


A 


9378 


302 


1303 


GVSGFSASVLRQRRMEDELEPSLRPRTQIQGR 

ILLLTICAAGIGGTFQFGYNLSIINAPTLHIQEF 

TNETWQARTGEPLPDHLVLLMWSLIVSLYPL 

GGLFGALLAGPLAITLGRKKSLLWNNIFWS 

AAUFGFSRKAGSFEMIMLGRLASWGVNAGV 

SMNIQP\MLPGGESAPKELRGAVAMSSAIFTA 

LGIVMGQWGLSTTAATGLRGL\AGELEELEE 

ERAACQGCRARRPWELFQHRALRRQVTSLV 

VLGSAMELCGNDSVYAYASSVFRKAGVPEA 

KIQYAnGTGSCELLTAWSVSLEGALPPPAL 

WGGTPRSFALNQFTLQKKKK 


1171 


2521 


A 


9381 


2 


412 


RGPASAQEDERARTAPLERVRARGRMTTSSA 
LFPSLLPCSWSTSNKYLAEFRAGKMSLKGTTE 
TPDKRKGLAY/IQQTDDSLIHFCWKDRTSGNV 
EDDLIIFPDDCEFKRLPQCPNGRVYVLKFKAG 
SKRLFFWMQEP 


1172 


2522 


A 


9384 


20 


355 


GWNGRSTEASPAAEAPHYPHKEIAKAAMGTQ 
CTHGGKVRPDPHDMLTTVVHKIKLFVLCHSL 
LQLCAIMISD YLKS SIYTVEKRLGLFRPTSGLL 
ASFNEVGNTALIVLESY 


1173 


2523 


A 


9393 


430 


87 


LCQCIVPGQQKETFSLNPSSATVRFYL*LSLQ 
QRKEDQ* IIL * YHLNKDCLHIFMSATTL YMKI* 
KIFVLFDFhnMFETPFYII*FIFLFSQNLKRIRQV 
IRPPISFSKINNGP 


1174 


2524 


A 


9397 


77 


374 


ERLEIGRLGGERGSGPASCLRVIDVSGMWDQ 
RLVKLALLQLLRAFYGIKVKGVRVHRDCGTF 
ESSSTLIRVS* FG VPCNALAHFGVTHF* YILDF 
LGML 


1175 


2525 


A 


9399 


66 


397 


HESSRADRDKMDTRGSTTTDADPVNKSGGT 
AKMNKWSKGKVRDKLNNLVLFDTATYDKL 
CKEVPNYKLITLAVVSERLK1PGSLARAALHE 
LLSRGLI*LVIQHIAQVIY 


1176


2526 


A 


9408 


2 


299 


LDLTHVLSLSISLTVTILGTTFGMVIPLLDVVY 
GERGYAQNGDF*DAQLDDYSFSCYSHAQVN 
GAPNSLTRAYDDP* VKISGLECQKVGALVE V 


1177 


2527 


A 


9416 


2 


402 


CNFLRSSRIRVHSTPAASTMPPKVDPNEIKW 
YLRCTGGEVRATSALAPKIGPLGLSSKVGVD 
FV*ATGDWmiISV^TKILLSHIFVVPPFFCF 
DHLIAFWDLQSLffLHVIFSLFITLLLFCFFSIF 


1178 


2528 


A 


9419 


142 


426 


TPLFDLWPRWLSWLETVLTSLRTRRAASGPP 

ACR1MPTTVDDVLEHGGEVHFLQKQMLYLL 

ALI*DTFAPIYVGIVFLGFTPDHRCRSPGVAEL 


1179 


2529 


A 


9420 


1450 


1655 


LSS AGTKMNLN * KN YWPGASAHACNPSTL G 

GQSRCITRSGDRDHPG*HGETPSVLKtQKISRA 

WWRAP


Amino acid sequence (A=Alanine, C=Cysteine,
D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion


1180 


2530 


A 


9422 


176 


375 


HRPQTTRPDWKPRT+PQGK*GRLSSEISPASPP 
SRFSRSTKPVPFK ADPPARQKT ,TG VI ,HAPLLK 
L 


1181 


2531 


A 


9436 


2 


274 


PlAASLRMYNLQPV'lliKNLICTAFATMVETVP 
IARTELDRLTGIPHG YCFVE* AD W AT ADKCVH 
IYNGKPLPGATPLLSLQLHQLAHLGS 


1182 


2532 


A 


9442 


3 


240 


VXJKCSSKSIVLSEYCPHCMCSLSTDPKPFGQL 
SM1LK* MGAGDEK1S AMGKARVDHREL YLGL 
LYPTEDYKLTFRARH 


1183 


2533 


A 


9444 


384 


3 


LKDFQPWALHDWPLFCCCTFLLFLVLECFTR 
KGCSGWAPWLSLQCQHFGRPRWADHLRSGV 
RDQPGQYSKTTFLPKIQKXAGHSGAHL* S*LL 
ERMRWKNRLNPGGRSCSEPRWHHCTPGWAT 
ERG 


1184 


2534 


A 


9462 


391 


655 


LSGFKSLMPKIPLQYIYVRVRTTWSFCLPLDG 
RKLMLS*YSK*LT*KYNILPEYSRMTLPPGMV 
IFTTCNPSTLGGR AG WTV* AQEFET 


1185




A


9467 


215 


566 


RCPMWQGQASRMDPAKAKDREASTCCSLA 
WWWGWECWVRALKLSSGPAGPLACWVAK 
KKSLSLSGPVYPSEKGAGLYVF*DRVSLCHPG 
WSAWQFWLTAASNSCFSLLSSWDYRCA 


1186 


2536 


A 


9468 


275 


452 


HIPQIJiTKTHYVFrRMVNKI*QIDNSKPWQR 
GG*TGILTHCW*ESKXVQPLWK1VWHYQ 


1187 


2537 




9469 


388 


3 


EVAPGPSQILPRRVTDGGDRPQFSLPGPRLPQ 
SSRGAEPCLSNCIHSPAPRKQRMGDSDQ* STP 
NPASPHPEAPQEPWDSASGSVGSFSLGRGAK 
ASS+VPGKGRGPRQGSELLAETILELFLALAN 
S 


1188 


2538 


A 


9471 


124 


397 


TMDKKNRHGNSLDMASEIHMTGPMCJLlENTr 
GRLMANPE ALKIL S AITQPM VEE AIAGL YRAC 
* FYLTNNLAGMKKGLCLGSTEQ AHTIGI


1189 


2539 


A 


9480 


584 


769 


GHVQSQHFGRPRRADHLRSGDRDHPG*HDET 
PSLLKIQKISWAWWRAPWPATWEAEAEEW 


1190 


2540 


A 


9483


463 


86 


VTV6LtLLLRGAi>RFtA6*PPSGGGPPLAPLL 
PRQHClt J QTHRHLHPEAPVKV*KT*RLFPGLR 
GASSCRRRRCNPVLAARKAGSPRSHSTRENC 
RR SRCPDTAHRRR RRGRRRNPSCVRSPRWR 


1191 


2541 


A 


9489 


1 


411 


LADALCLSAAATGAVRPGARAQPSTRRRLSP 
SVRVCCRAAAASNLLYSSCLQRHSERASEEG 
ERGSLSAKCGSLVLRGGCSSSNSHSFRRIT*EI 
MAAFVLLSYEQRPLKRPRLGPPDVYPPDPKQ 
KEEELTAVNVK 


1192 


2542 


A 


9497 


389 


161 


VSFLSMSSGHCIRSTRGSKMVSWSVIAKIQEI* 
CEEDERKMAREFLAEFMSTYVMMNIHMIVE 
KDTYSDHEEINTS 


1193 


2543 


A 


9509 


186 


1 


IAK5Q*KRWQRSGAMETLKHGWWECKLVQF 
FGKTFVNVN* S*TYVYPCDKHLLLGL YPTEM 


1194 


2544 


A 


9512


58 


433 


PLQRSKCLTtRCLkAI^PWAWSQSPRACSSAL 
LKSSRSRASSLNVQCILQSNPQGHQRI*KQKA 
SSKGQQFRR* KEHPFMLKTLNKLRIEGT* LKI 
RRAIYDNPTAN1IVEGQKLEAFPLRTGTRQ 


1195 


2545 


A 


9515 


595 


1223 


GHGAPSFQTQVPRTP*ASWPWPAASESAPAP 

AGGGASLPVAAGSCAAAPHTEPGAPQHLLDC 

PCPLCLARFPRRPLPDTCYGPGSGRSASLAEPP 

LPRCSCAPLRSASAPQVS*CV*AVNLLPEJNL* 

PLHLLLHD*EKAWGFLFSSASHCFQGQICLLP 

APGSGPCGATARPSRGGRAGGSRARRPIPPGP 

GTRRTPSGCQNPAASGG


NO: of
seq-
SEQ ID
NO: of
peptide
sequence


Method


SEQ
ID NO:
in
beginning
nucleotide
location
corresponding
to first
amino acid
nucleotide
location
corresponding
to last amino
acid residue
of peptide
sequence


Amino acid sequence (A=Alanine, C=Cysteine,
D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion


1196 


2546 


A 


9518 


229 


ACQ 


Ivor J A 1 r AriiAMUr UArr ARUUlvr Lr ULG AM 
AERVAPG WDLHTPYLPRTNSRRTPHL* *EPHA 

VJ 1 1 VJ/VLiT riYj. O \J\J W JTVJVJV^ 


1197 


2547 


A 


9521 


289 


448 


IAWLSGLFFPSNQANLOF1CYKLTADSRYRG 
HAMRHLTGNTSMAIRFL*ADSRFQVQRARYE 
APNWKYKYGY* IPVDMLC 


1198 


2548 


A 


9524 


204 


1 


KNKKTTKeLSIVTLNISGPNQ*NKRHRVAEWI 
VKQEPN1CHL* ETHFPFRDTYMiCEREQKKRK 
SSYS 


1199


2549


A 

A 




1 TO* 


1943 


GGRFKESKLTNAGWQRNSFFIGPPKSIPWAA 
V* QRGDGKNPG VTHLNRP VGTX 


1200 


2550 


A 


9548 


186 


1 


WAEKEF*KIQHYF\n"KSQNKIJfflEHmKPr 
KA1 YDK WTSDIMLNLQKL* AFFLRVI VRQI 


1201 


2551 


A 


9549 


591 


2 


SSVVEFPRGPRSSLPPLDSTFPCGSSPNWTGGC 
GSCPSGE*LVSPGSEQRKKYSNSNVIMHETSQ 
YHVQHLATFIMDKSEAITSVDDAIRKLVQLSS 
KEKIWTQEMLLQVhIDQSLRLLDIESQEELEDF 
PLPTVQRSQTVLNQLRYPSVLLLVCQDSEQSK 
PDVHFFHCDEVEAELVHEYMESALTDCRLGK 
AMRP 


1202 


2552 


A 


9552 


428 


1 


KYGNEGHWSRQCPNPGKPIRPCPLCRGPHWK 
LDCERPPQGPLPSLPELAKTSYSDLTGLATED 
* WGPGMDAPATTIAS SKTRVTLM VAGRP VFF 
LI*YRATYSALPNFSGPTQSSQVSWGIDGQV 
SKPRATPPLFCSLHTF


1203


2553


A 




517 


738 


RRKFERKQKQ* R YREGKQYRQRDKMKE WG 
EKEKRRREKGEREERKMRHRERKGESGQRD 
TMENWRVERLTEKER 


1204 


2554 


A 


9573 


83 


415 


EDKRLRLVDGDSRCAGRV*IYHDGFWGTICD 
DGWDLSDAHVVCQKLGCGVAFNATVSAHFG 
EGSGPIWLDDLNCTGTESHLWQCPSRGWGQ 
HDCRHKEDAGVICSEFTALR 


1205 


2555 


A 


9577 


64 


424 


ARGSCPTRPRTANGRMGETKDAPQMLVTFK 
DVA VTFFREEWRQL VL VHRTL YR* GMLETC 
GLLDTLRHNVPQPDWHLLYHGTQLLIXTCRE 
VbHSPCAGDMRFXFTIvEATLTP^ 


1206 


2556 


A 


9584 


38 


476 


TLGAVLFSEVSKESSTSHSGGQLGRQNRHPKL 

SOTlTreSPIaKP'TASSQRNLGQILNMFLTAV 

NP^PLSTreWQIETKYSTKVLTGNWMjSERRK 

GU>YKHLmTHQEPPHRYLISTYDDHYNRHG 

YNPGLPFLRTWNGQKLLWL 


1207 


2557 


A 


9586 


2 


412 


LRS SPAALLRALCITTVTGTALALRSRV ATTN 
PDU CRN VLRPKYY1<L CDKAES WG1ALETVPT 
GVAVTSWAIMLTVLTLVCKGQDYNRRQKLP 
THTLCLL*EKGIFGLTFAFnGLDGSTGPTRFFL 
FGILFSICFS 


1208 


2558 


A 


9597 


122 


3 


1KNYWPGMVAHACNPSPLGGRGRWIA*AQK 

A A TIT A A ITT 

FADAWADAW 


1209 


2559 


A 


9611 


148 


558 


KSLRNVWDLLNNT WK ADR FFfHS^R W\\u K 
GDPGPTFSKMS1WTSGRTSSSYRHDEKRNIYQ 
RIRDHDLXDKRKTVTALKAGEDRAILLGLAM 
MVCS1MM*FLLGITLLRSYMQSVWTRESQCT 
LLNASITETFNC 


1210 


2560 


A 


9618 


384 


2 


SLHDMLMLAEQQQKQKWAVNTQNTAWSNA 

DSKFGQRILEKMEWSKGRGLGVQEQGGPDDI 

KVQVKNNDLGLQATOuVl^NWIAHQDDFNW 

LLAELNTCQRQETADS***WSPKNSHVGKDS 

GELSAK 


1211 


2561 


A 


9620 


316 


610 


QKHPGCKjQLGRSPQEDSRFHNKASSGVSRVR


seq-
SEQ
ID NO:
in
USSN
09/496
914


Predicted
beginning
nucleotide
ng to first
amino acid
peptide
to last amino
sequence


Amino acid sequence (A=Alanine, C=Cysteine,
D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion














LGRAW WLTPVIPTLWEAKAGGSPE* D* AGRG 
GSRL* SQHFGRPRRVDHLRSAVQDQPGQHGE 
TPSLLKIQKIN*VWGRRL*SSYSEAEAGESL 


1212 


2562 


A 


9623 


297 


344 


QFPVDGDYQKIEKITQLFQAQNLSLCLAMTR 
TREL*KGGGKGRHE*AVVPFLKKGGYGVKAP 
AILNTSNCT* CF*ETKMLSDDPKACVFEVSS A 
DL*NTSFGVIR 


1213 


2563 


A 


9624 


2 


356 


AELSLASTACGRNTSGDSLPDYDRAPISSPLA 
TSGTDLSAISCLWDLPTPVLRVGLSCQPSMSSQ 
IPRMYSTDVEAAVNSLEDLYLQAYYAYLCVG 
LYFHRDDMALEG VSRFL* ELAE 


1214 


2564 


A 


9634 


776 


912


SLSRWVRAKL*VPYNQENCLNPRGGGCSEPR 
SHYCTPAWATEKDS 


1215 


2565 


A 


9636 


220 


426 


KPGNFAVSSEY*DITSGQLKTAVRG*IEMTST 
EENFGEKLHDIGFGNGFLDKT 4 KAQATKAKI 
DK 


1216 


2566 


A 


9637 


391 


76 


CFLEDGCTQAS * AEEAAVSPSMAEEEQGSTSC 
RERRSIRFKMKNHSPDDTIKENVTISNIRTRKI 
NHLPETERNLLEHGLMYIRLNAAFCSLVAHS 
LFGFILKAT 


1217 


2567 


A 


9655 


2008 


2432 


LHCKMGALETQTHPCSQNMLRSLQKCCCKV 
EEHHLQP VQ VLQTLLHSATAGTG CRRPARPP 
PAPPTPTPWRSRQSGKQSERAS*LKGRGRYGL 
GALGGRGGRALGGSRWPPPLPGETLFSGCKH 
RRRRRGSDAAPGEEAGT 


1218 


2568 


A 


9658 


3 


405 


HASARALLSPNLSPNNKMA1SGGPVLGFFIIA 

VLMSAQEPWADCEEHVIIQAEFYLNPDQSGEF 

MLDFEGEI7IPHGDMAKKFrVWRLE*LARLD 

NFEAQRALANIAADQAALEIMDMGSDYTLIP 

NVPPKVTVL 


1219 


2569 


A 


9662 


3 


284 


PDWTEKRKMQDTGSILPLHWFGFGYAALVA 
YGGIIGYVKAGSVPSLAAGLLFGSLSGLGAYQ 
LSQDPRNVWVFLATSGTLAGIMGMRFYI^SG 
KL 


1220 


2570 


A 


9669 


200 


699 


LLLTGYIQfTLQNQQLSGNQQEMQAVDNLTSA 

PGNTSLCTRDYKTTQVLFPLLYTVLFFVGLITN 

GLAMRIFFQIRSKSNFIIFLKNTVISDLLMILTF 

PFKILSDAKLGTGPLRTFVCQVTSVIFYFTMYI 

SI SFLGLITIDRYQKTTRPFKTSNPKNLLGAKIL 

K 


1221 


2571 


A 


9676 


164 


562 


KERDSSTFSAAMTTMQGMEQAMPGAGPGVP 
QLGNMAVIHSHLWKGLQEKFLKGEPKVLGV 
VQILTALMSLSMGITMMCMASNTYGSNPISV 
YIGYTIWGSVMFIISGSLSIAAGIRTTKGLVRG 
SLGMNTTSS 


1222 


2572 


A 


9688 


43 


412 


VAKMVKCCSAIGCASRCLPNSKLKGLTFHVF 
PTDENIKRKWVLAMKRLDVNAAGIWEPKKG 
DVLCSRHFKKTDFDRSAPNIKLKPGV1PSIFDS 
PYHLQGKREKLHCRKKFTLKTVPATNYNH 


1223 


2573 


A 


9696 


308 


564 


RTSMGILYSEP1CQAAYQNDFGQVWRWVKE 

DSSYANVQDGFNGDTPLICACRRGHVRTVSFL 

LKKECLCQPQKPERENLLALCCE 


1224 


2574 


A 


9700 


3 


632 


DAWASGGELGSLFDHHVQRAVCDTRAKYRE 

GRRPRAVKVYTTNLESQYLUQGVPAVGVMK 

ELVERFALYGA1EQYNALDEYPAEDFTEVYLI 

KFMNLQSARTAKRKMDEQSFFGGLLHVCYA 

PEFETVEETRKKLQMRKAYVVKTTENKDHY 

VTKKKLVTEHKDTEDFRQDFHSEMSGFCKA 

ALNTSAGNSNPYLPYSCELPLCYFSSK


SEQ ID
NO: of
seq-
Met
SEQ
in
Predicted
beginning
nucleotide
location
corresponding
to first
amino acid
residue of
peptide
sequence


Predicted end 
nucleotide 
location 
corresponding 
to last amino 
acid residue 
of peptide 
sequence 


Amino acid sequence (A=Alanine, C=Cysteine,
D=Aspartic Acid, E=Glutamic Acid,
F=Phenylalanine, G=Glycine, H=Histidine,
I=Isoleucine, K=Lysine, L=Leucine,
M=Methionine, N=Asparagine, P=Proline,
Q=Glutamine, R=Arginine, S=Serine,
T=Threonine, V=Valine, W=Tryptophan,
Y=Tyrosine, X=Unknown, *=Stop codon,
/=possible nucleotide deletion, \=possible
nucleotide insertion


1225 


2575 


A 


9710 


1 


lti 


RSGCVLRMTEWETGAPAVAETPDIKLFGKWS 
I UUVHlNJJlbJUyD YIAU VRL1LL 


1226 


2576 


A 


9713 


82 


492 


QGLPSFLPAFGPSGSWLGPAPTLGSSCNTVDT 
llnu i olSlKrLr YLSrUDLLLuLCWLTElLLYG 
ASVA^^^^CY^nLQAVGQIFYlSSFLYTVNYI 
WYLYTELRMKHTQSGQSTSPLVIDYTCRVCQ 
MAFVTSSLI 


1227 


2577 


A 


9720 


3 


416 


GKWKRTQVPLLGEECADMDLARKEFLRGNG 
LAAGKMNISIDLDTNYAELVXNVGRVTLGEN 
NRKKMKDCQLRKQQNENVSRAVCALLNSGG 
GVIKAEVENKGYSyKKDGIGLDLENSFSNML 
PFVPNFLDFMQNGNYF 




2578


A 

A 




278 


411 


EASSSNTVASNVADKTDFHSMNSRVFIGNLN 
TLVLQKSDVEAVF 


1229 


2579 


A 


9725 


121 


902 


LFAMSGFENLNTDFYQTSYSIDDQSQQSYDY 
GGSGGPYSKQYAGYDYSQQGRFVPPDMMQP 

QQPYTGQIYQPTQAYTPASPQPFYGNNFEDEP 

PLLEELGINFDHIWQKTLTVLHPLKVADGSIM 

NETDLAGPMVFCLAFGATLLLAGKIQFGYVY 

GISAIGCLGMFCLLNLMSMTGVSFGCVASVL 

GYCLLPMILLSSFAVIFSLQGMVGIILTAGIIG 

WCSFSASKIFISALAMEGQQLLVAYPCALLYG 

VFALISVF 


1230 


2580 


A 


9739 


11 


247 


TFVLNMNTPKEEFQDWPIVRIAAHLPDLIVYG 
HFSPERPFMDYFDGVXMFVDISGKCKRDVCL 
MWMSNRLAWEFTCRA 


1231 


2581 


A 


9744 


37 


1100 


TPLFDFWPGFVLSWLQPLSASLRARRAASGPP 

ACRIMPTTVDDVLEHGGEFHFFQKQMFFLLA 

LLSATFAPIYVGIVFLGFTPDHRCRSPGVAELS 

LRCGWSPAEELNYTVPGPGPAGEASPRQCRR 

YE VD WNQ STFDCVDPL ASLDTNRSRLPLGPC 

RDGWYYETPGSSIVTEFNLVCANSWMLDLFQ 

SSVNVGFFIGSMSIGYIADRFGRKLCLLTTVLI 

NAAAGVLMAISPTYTWMLIFRLIQGLVSKAG 

WLIGYILITEFVGRRYRRTVGIFYQVAYTVGL 

LVUGVAYALPHWRWLQFTVALPNFFFLLY 

Y Yv ClrfcbrK WLISQNjKNAEAMRIIKHIAKKNG 

KSLPASL 


1232 


2582 


A 


9753 


164 


517 


PGPGMQGPPPITPTSWSLPPWRAYVAAAVLC 
YINLLNYMNWFUAGVLLDIQEVFQISDNHAG 
LLQTVFVSCLLLSAPVFGYLGDRHSRKATMS 
rulJULWDuAuLbborlbrKYbWLF 


1233 


2583 


A 


9757 


25 


419 


LPAPWTERVRKSEGLVGTCLGDPMASPRTVT 
IVALSVALGLFFVFMGTIKLTPRLSKDAYSEM 
KMYKSYVRALPLLKKMGINSILLRKSIGALE 
VACGIVMTLVPGRPKDVANFFLLLLVXAYLF 
FHQLV 


1234 


2584 


A 




71


430 


KLbLD Wur aLlif I^VAYLCPLSSGFEMNVQP 

CSRCGYGVYPAEKISCIDOIWHKArFHrFVr 

KMMLSVTsOTVSHQKKPYCHAHNPKNNTFTS 

VYHTPLNLNVRTFPEAISGIHDQEDGEQCKSV 

FHWD 


1235 


2585 


A 


mi 


52 


559 


IRSGAMSVDKAELCGSLLTWLQTFHVPSPCA 

SPQDLSSGLAVAmNQIDPSWFNEAWLQGI 

SEDPGPNWKLKVTSGLLIRGQTGEEMTRDGP 

ARHMSWVMGRKRDRCLVINHIJIHSSMEYSP 

CARPGHSARNNTDKNLPHTAin^VTSNTYTn 

KJNFQAGRSGSCL 


1236 


2586 


A 


9770 


352 


608 


FRGEALmFLTKRFIGEYASrtoSIYKKhLC
LERKQLNLEIYDPCSQTQKAKFSLTSELHWA
DGFVIVYD1SDRSSFAFAKAL1 


1237 


2587 


A 


9793 


266 


515 


NILAEYFPFPRLFLLRDSQSNPKAFALTLCHH 
QKIKNFQILPVSIDALTPPLWCFLVSFLTHFS 
RYKPTRPVOTQFQGCS 


1238 


2588 


A 


9802 


537 


967 


ELGAGRSDREAMEAAVKEEISVEDEAVDKNI 
FRDCNK1AFYRRQKQWLSKKSTYRALLDSVT 
TDEDSTRFQIINEASKVPLLAEIYGIEGNIFRLK 
INEETPLKPRFEVPDVLTSKPSTVRLISCSGDT 
GSLIL ADGK.GDLKC 


1239 


2589 


A 


9805 


105 


540 


VPGDPAMVRAGAVGAHLPASGLDIFGDLKK 

MNKRQLYYQVLNFAM1VSSALMIWKOLIVLT 

GSESPIVWLSGSMEPAFHRGDLLFLTNFRED 

PIRAGEIVVFKVEGRDIPIVHRVIKVHEKDNG 

DKFLTKGDNNEGDDRGSYK 


1240 


2590 


A 


9819 


3 


305 


TDGRDPLPCAARRRGGGGECCGAGWVAEWS 
PQPLDPAMLLWMQGFVLEAVACQDNDDYLR 
YGHJFEDLDCNGDGWDIIELQEGLRNWSSAF 
DPNSEEHG 


1241 


2591 


A 


9834 


841 


1209 


SPARGKWRTDVM1TAPKNKKMTENLAAPEA 
LDSSTHSSSTATQSRAKMNTPAPTPSTVPAJOPR 
GGSGGPPPCAPHDRVS S VLQCDTQAMDHKTE 
SSHSVVEFLFKRTKTPSPFHPAVRENRN 


1242 


2592 


A 


9843 


3 


589 


TISCGPATEPPASLLSSAS SDDFCKEBCTEDRYS 

LGSSLDSGMRTPLCRICFQGPEQGELLSPCRC 

DGSVKCTHQPCLIKWrSERGCWSCELCYYKY 

HVIAISTKNPLQWQAISLTVIEKVQVAAAILGS 

LFLIASISWLIWSTFSPSARWQRQDLLFQ1CYG 

MYGFMDVMIVAVDSEDMVQAAKEVGKRWS 

DIPP 


1243 


2593 


A 


9846 


198 


411 


WRISHHAGKMPVMKGLLAPQNTFLDTIATRF 
DGTHSNFILANAQVAKGFPIVYCSDGFCELAG 
FARTEVMQ 


1244 


2594 


A 


9848 


116 


650 


PICGFLYLCSAMASESSPLLAYRLLGEEOVAL 

PANGAGGPGGASARKLSTFLGVWPTVLSMF 

SIWFLRIGFWGHAGLLQALAMLLVAYFILA 

LTVLSVCAIATNGAVQGGGAYCILQHRWTG 

VWPVLPAREVMISRTLGPEVGGSIGLMFYLA 

NVCGCAVSLLGLVESVLDVFGA 


1245 


2595 


A 


9849 


573 


1620 


KSKCRFPEGLSEGFGPMRKEALSSGSVQEAE 

AMIJ5EPQEQAEGSLTVYV1SEHSSLLPQDMM 

SYIGPKRTAWRGIMHRE'VFNUGRRIVQVAQ 

AMSLTEDVLAAALADHLPEDKWSAEKRRPL 

KSSLGYEITFSLLNPDPKSHDVYWDIEGAVRR 

YVQPFLNALGAAGNFSVDSQILYYAMLGVNP 

RFDSASSSYYLDMHSLPHVINPVESRLGSSAA 

SLYPVLNFLLYVPELAHSPLYIQDKDGAPVAT 

NAFHSPRWGG1MVYN\T>SKTYNASVLPVRV 

EVDMVRVMEVFLAQLRLLFGIAQPQLPPKCL 

LSGPTSEGLMTWELDRLLWARSVENLATATT 

TLTSLA 


1246 


2596 


A 


9850 


114 


464 


PPQLGAQRVREPRHPDVRAPLRVTSPGLRSRS 
ARSLGRRPRIAMVTVGNYCEAEGPVGPAWM 
QDGI^PCFFFTLVPSTRMALGTLALVLALPCK 
RRERPAGADSLSWGAGPRISSYV 


1247 


2597 


A 


9851 


2 


327 


FVRNKJKMTRSCSAVGCSTRDTVLSRERGLSF 
HQFPTDTIQRSKWIRAVNRVDPRSKKIWIPGP 
GAILCSKHFQESDFESYGIRRKLKKGAVPSVS 
LYKVFKYSSRCTS


1248 


2598 


A 


9853 


5& 


444 


RVDDFVYSKGGKDAGGADVSLACRRQSlPEE 
FRGITWELIKKEGSTLGLTISGGTDKDGKPR 
VSNLRPGGLAARSDLLNIGDYIRSVNGIHLTR 
LRHDEUTLLKNVGERWLEVEYELPPPGGCP 
WT 


1249 


2599


A 


9856 


2


1265 


LPPPRPSRHRRGRAGTRASAAAAAGPTVSAV 

RAPVRGQDSGAGTPQGRUGRGAHLSRVGA 

SGSGVAAGPAARIIAPRRRCADAGEAVGASC 

UKCAYALLSGVCILVSTHVCVGSGCPGAAGT 

PMGAGDAGASAESAVTTAPQEPPARPLQAGS 

GAGPAPGRAMRSTTLLALLALVLLYLVSGAL 

VFRALBQPHEQQAQREL'GEVREKFLRAHPCV 

SDQELGLLKEVADALGGGADPETNSTSNSSH 

SAWDLGSAFFFSGTIITTIGGGGDWHVGGGK 

ELPHGGRCRETEGSQVAPRLPASPLCPGYGN 

VALRTDAGRLFCIFYALVGIPLFGHLAGVGD 

KUjdoLKHuHjJiUiAlrLK 

MLFLLIGCLLFVLTPTFWCYMEDWSKLEAIY 

FVIVTLTTVGFGDYVA 


1250 


2600 


A 


9873 


2 


652 


FWPSPCGGIPGRAPNGASRPTMGNSASRNDF 

EWVYTDQPHTQRRKEILAICYPAIKALMRPDP 

RLKWAVLVLVLVQMLACWLVRGLAWRWLL 

FWAYAFGGCVNHSLTLAIHDISHNAAFGTGR 

AARNRWLAVFANLPEGVPYAASFKKYHVDH 

HRYLGGDGLDVDVPTRLEGWFFCTPARKLL 

WLVLQPFFYSLRPLCVHPKAVTRMEVLNTLV 

QLA 


1251 


2601 


A 


9874


uu 




JP VuWLHFbPtfDl VRPSCCVSSSPKLRRNAHSR 
LKSYRPDTDLSREDTGCNLQHISDREN1DDLN 
MEFNPSDHPRASTIFLSKSQTDVREKRKSLFIN 
HHPPGQIARKYSSCSTTFLDDSTVSQPNLKYTI 
VALAl i YHIKJNIRDPDGRMLLDIFDENLHPL 
SKSEVPPDYDKHNPEQKQIYRFVRTLFSAAQL 
TAECAIVTLVYLERLLTYAEIDICPANWKRIV 

L>\jj\li^l^j\oS\.y WUUyAV WIN YJJ I Cv^lLKl/l 1 Vil 

DMNELERQFLELLQFNINVPSSVYAKYYFDL 

RSLAEANNLSFPLEPLSRERAHKLEAISRLCED 

KYKDLRRSARKRSASADNLTLPRWSPAIIS 


1252 


2602 


A 


9879 


6 


376 


KRPDSRPPAQYRAGPTRPRTRGCELLYWKAT 

DNIFFSSLSLLYALSMVLLGARGETEEQLEKV 
WNSSEVCSEPRSLSCSRSGSAKLILSLYQ 


1253


2603 


A 


9880 


180 


m 


KEQAELLYGLYCQCDLTLSSHPSSVPAMSSC 

"NFTHATFV1 If»n>f»T FTf AHFWVnrrDT t ck/iwa 
nc i lirt. ir v jliult uJjEJVArir W Y ur r LLoJvl Y V A 

AMFGNC 


1254 


2604 


A 


9881 


19 


494 


VISFQIITDllMDSSTAHSPVFLVFPPEITASEYE 
STELSATTFSTQSPIXJKLFARKMKILGTIQILF 

S\^FINSGAFLIAVKRKTTETLIILSRIMNFLSA 
LGA1AGIILLTFEFHPRSKLHL 


1255


2605 


A 


9896 


72 


386 


RPGREQRDCFQAPPLGLGGRQTDMMHHPLT
GATCVGLPNVGMCPQLSGALTFMYLQQGNQ 
EATVAPDTMAQPYASAQFAPPQNGIPGEYTA 
PHPHPAPEYTGQTT 


1256 


2606 


A 


9902 


95 


399 


SGGPAGLLHRPVLPKMGLSGLLP1LVPFILLG 
DIQEPGHAEGILGKPCPKIKVECEVEE1DQCTK 
PRDCPENMKCCPFSRGKKCLDFRKVSLTLYH 
KEELE 


1257 


2607


A 


9905 


374 


459 


EHLKSTPNRLGVVAHTCNPSTLGGRGGW
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1258 


2608 


A 


9911 


364 


1$74 


AGPGVPAVGGRWASGPGLGGRTLCSGPPDH 

QRRGPSCGASGDPQCVGSFHPQRARPLLARP 

GARLLPGHLPSPRPPRLPTGQPPAAAFRGPVR 

PQGGGHHiPLPTPGORPCFAVSEGSGSALLLS 

YLGECGS SSYVTGAACISPVLRCREWFEAGLP 

WPYERGFLLHQKIALSRYATALEDTVDTSRL 

FRSRSLREFEEALFCHTKSFPISWDAYWDRND 

PLRDVDEAAVPVLCICS ADDPVCGPPDI ITLTT 

ELFHSNPYFFLLLSRHGGHCGFLRQEPLPAWS 

HEVILESFRALTEFFRTEERIKGLSRHRASFLG 

GRRRGGALQRRE VSSSSNLEEIFNWKRS YTRL 

MAAAAGAAAAPGSREPQDRPECGAGHPGPR 

YYRHPERWLLRPEAFLGPLRTRAPSAEDSQR 

ERPAARSGPEMRVRYPWAAVLAPYLALSQD 

PMYKSSASGQGASGSYNHVREEMLIKAGGA 

MSRRVVRQSKFRHVFGQAAKADQAYEDIRV 

SKVTWDSSFCAVNPKFLAIIVEAGGGGAF1VL 

FLAK 


1259 


2609 


A 


9919 


693 


935 


GCFKFIGESTCCWIFPSSYTTQCWAKAPRAA 
TLSKAERLRSQPGPEQGGSSYRPRTPTAAAIL 
PPRPGRSHRKRKLVSTK 


1260 


2610 


A 


9921 


455 


1082 


QRSCLCSAIEKDGGDVKALYRRSQALEKLGR 

LDQAVLDLQRCVSLEPKNKVFQEALRNIGGQ 

IQEKVRYMS STDAK VEQMFQILLDPEEKGTE 

KKQKASQNLWLAREDAGAEK1FRSNGVQLL 

QRLLDMGETDLMLAALRTLVGICSEHQSRTV 

ATLSILGTRRWSILGVESQAVSLAACHLLQV 

MFDALKEGVKKGFRGKEGAITV 


1261 


2611


A 


9928 


1 


438 


GFRGAEAPGAAQAPKKKKPRPTEGGPGAGSG 
RGKDPYRGPTLLHQPKPPKDEFLS SLES YEI AF 
PTRVDHNGALLAFSPPPPQRQRRGTGATAES 
RLFYKEASPSTHFLLNLTRSSRLLAGHVSYEY 
WTREGLAWQRADRPHCLYA 


1262 


2612 


A 


9931 


168 


435 


AAEMGRAGAAAVIPGLAELWAVGLGGPPPA 
PPRLPFCLQELQGRHALHTFSLERTCSYQDFL 
WADEGRLLHVGAQDLATWHTLSPLGLW 


1263 


2613 


A 


9938 


247 


488 


RMSATSVDQRPKGQGNKVSVQNGSIHQKDG 
CNDDDFEPYLRSPDNQSNSYPPMSDPYMPGY 
YAPSIGFPYSLGEAAWSQL 


1264 


2614 


A 


9941 


61 


17/ 


ES1GLTALGPRRRPWEHRWSDPITLKMKGWG 
WIAIJXGAIXGTAWARRSQDLHCGACKAVR 
RRVRQFNIYDY 


1265 


2615 


A 


9956 


2 


522 


FVASEVSKMPVPASWPHPPGPFLLLTLLLGLT 

EVAGEEELQM1QPEKLLLVTVGKTATLHCTV 

TSLLPVGPVLWFRGVGPGRELIYNQKEGHFP 

RVTTVSDLTKRNNMDFSIRISSrTPADVGTYY 

CVKFRKGSPDHVEFKSGAGTELSVRGEYSVG 

FLSQVWWWLSSHPFMN 


1266 


2616 


A 


10002 


243 


387 


PKNNACHLLFTAVCQPRCKHGECIGFNKCKC 
HPGYAGKTGNQGRKTV 


1267 


2617 


A 


10004


36 


707 


LPAPAST WS VARETMAS SS VPPATVS AATAG 

PGPGFGFASKTKKKHFVQQKVKVFRAADPLV 

GVFLWGVAHSINELSQVPPPVMLLPDDFKAS 

SIGKVNNHLFHRENLPSHFKFKEYCPQVFTWL 

RDRFGIDDQDYLVSLTRNPPSESEGSDGRFLIS 

YDRTLVDCEVS SEDIADMHSNLSNYHQ VRPL S 

SPILSLSSLLTYSSAIVSNRCQLGRKLIGRENP 


1268 


2618 


A 


10005 


2 


209


GEGYELFVPSNGVPAVCHMVGRRPHRAVLSP 
SQDELEHSLGESAAQGAAGWLWVSWENTR 






WO 01/57188 PCT/US01/03800 



SEQ ID NO: of nucleotide sequence

SEQ ID NO: of peptide sequence

Method

SEQ ID NO: in USSN 09/496914

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

Amino acid sequence (A=Alanine, C=Cysteine,
D=Aspartic Acid, E=Glutamic Acid,
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nucleotide insertion 














TKVSLGLA 


1269


2619 


A 


10010


245 


688 


FOMLK>nCGHSSKKDNLAVNAVALQDHILHD 

I Of RTvTT ^IVAHHQl^'Tnvnvi'PMSrci VDTrrv a t 
Jj VL>JKJNLO V /vL/tto-k. i Kl V l^ftJvDlN JvaLJvKIJl KA1 

1DTGLKKTTQCPKLEDSEKEYVLDPKPPPLTL 
AOKI HT TftPPPPPT Q^HPWTiWTfmJCT f nnno 

VQPCPICKEEFELRPQVFSIRG 


1270 


2620 


A 


10011 


2 


588 


RVDDmPLPPGLMSRSRASIHRGSIPAMSYA 
PFRDVRGPSTHRTQYVHSPYDRPGWNPRFCIl 

RTVSVPVEGRPHGEHEYHLGRSRJRKSVPGGK 
QYSMEGAPAAPFRPSQGFLSRRLKSS1KRTKS 
QPKJLDRTSSFRQILPRFRSADHDRYRGWSMW 
DEIDV 


1271 


2621 


A 


10013 


209 


363 


LPAPPNLSPRLSFGFQFPGGNDNYLTITGPSHP 
FLSGAEVSQSCRRRGGRA 


1272 


2622 


A 


10014 


7 


388 


SAVTISWKWRSVMGIQTSPALLASLGAGLVT
LLGLAVGSYLVRRSRRPQVTLLDPNEKDLLR 
LIDKTLSARSPCKHIYLSTRIDGSLSIRPYTPVT 
SDEDQGYVDIDIKVYLKGVHPTFPEGGKMSH 


1273 


2623 


A 


10016 


1 


1339 


MAARTLGRGVGRLLGSLRGLSGQPARPPCGV 

SAPRRAASGPSGSAPAVAAAAAQPGSYPALS 

AQAAREPAAFWGPLARDTLVWDTPYHTVW 

DCDFSTGKIOWFLGGQLNVSVNCLDQHVRKS 

rtb VALi w ERDcrGTEVRITYRELLETTCRLA 

NTLKRHGVHRGDRVAIYMPVSPLAVAAMLA 

CARIGAVHTVIFAGFSAESLAGRINDAKCKW 

ITFNQGLRGGRWELKK1VDEAVKHCPTVQH 

VLVAHRTDNKVHMGDLDVPLEQEMAKEDP 

VCAPESMG SEDMLFMLYTSGSTGMPKGI VHT 

QAGYLLYAALTHKLVFDHQPGDIFGCVADIG 

W1TGHSYVVYGPLCNGATSVLFESTPVYPNA 

GRYWETVERLKINQFYGAPTAVRLLLKYGD 

AWVKKYDRSSLRTLGSVGEPINCEAWEWLH 

RWGDSRCTLVDTWWQT 


1274 


2624 


A 


10017 


1 


3750 


FRPQGTPRSPASHVLTMSAPDEGRRDPPKPKG 

KTLGSFFGSLPGFSSARNLVANAHSSARARPA 

ADPTGAPAAEAAQPQAQVAAHPEQTAPWTE 

KELQPSEKNIVSGAKDLVCSKMSRAKDAVSS 

GVASWDVAKGWQGGLDTTRSALTGTKEV 

YSSGVTGAMDMAKGAVQGGLDTSKAVLTG 

TKDTVSTGLTGAVNVAKGTVQAGVDTTKTV 

LTGTKDTVTTGVMGAVNLAKGTVQTGVErS 

KAVLTGTKDAVSTGLTGAVNVARGSIQTGV 

DTSKTVLTCTKDTVCSOVTGAMNVAKGT1QT 

GVDTSKTVLTGTKDTVCSGVTGAMNVAKGT 

1QTGVDTSKTYLTGTKDTVCSGVTGAMNVA 

KGTIQTGVDTTKTVLTGTKNTVCSG VTGAVN 

LAKEAIQGGLDTTKSMVMGTKDTMSTGLTG 

AANVAKGAMQTGLNTTQNIATGTKDTVCSG 

VTGAMhnLARGTIQTGVDTTKIVLTGTKDTVC 

SGVTGAANVAKGAVQGGLDTTKSVLTGTKD 

AVSTGLTGAVNVAKGTVQTGVDTTKTVLTG 

TKDTVCSGYTSAVNVAKGAVQGGLDTTKSV 

VIGTKDTMSTGLTGAANVAKGAVQTGVDTA 

KTVLTGTKDTVT1 GLVGAVN VAKGTVQTGM 

DTTKTVLTGTKDTIYSGVTSAVNVAKGAVQT 

GLKTTQN1ATGTKNTFGSGVTSAVNVAKGAA 

QTGVDTAKTVLTGTKDTVTTGLMGAVNVAK 

GTVQTSVDTTKTVLTGTKDTVCSGVTGAAN 
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VAKG AIQGGLDTTKS VLTGTKD A VSTGLTG A 

VKLAKOTVQTGMDTTKTVLTGTKDAVCSGV 

TQAANVAKGAVQMGVDTAKTVLTG1KDTV 

CSGVTGAAKVAKGAVQfTGLKTTQNIATGTK 

NTLGSGVTGAAKVAKGAVQGGLDTTKSVLT 

GTKDAVSTGLTGAVNLAKGTVQTGVDTSKT 

VLTGTKDTVCSGVTGAVNVAICGTVQTGVDT 

AIClVLSGAKDAVTrGVTGAVNVAKGTVQfG 

VDASKAVLMGTKDTVF SGVTGAMSMAKGA 

VQGGIJDTTKTVLTGTKDAVSAGLMGSGNVA 

TGATHTGLSTFQNWLPSTPATSWGGLTSSRT 

TDNGGEQTALSPQEAPFSGISTPPDVLSVGPEP 

AWEAAATTKGLATDVATFTQGAAPGREDTG 

LLATTHGPEEAPRLAMLQNELEGLGDIFHPM 

NAEEQAQLAASQPGPKVLSAEQGSYFVRLGD 

LGPSFRQRAFEHAVSHLQHGQFQARDTLAQL 

QDCFRL 


1275 


2625 


A 


10025 


124 


415 


TTI^ARKKEKTCPCKKETGRNSRSGMYSRKAM 
YKRKYSAANTKVEKXKKEKVLAPVTKPVGG 
DKNGGTRVVKLPTMPRYYPTEDVPRKLLSHG 
KKPFS 


1276 


2626 


A 


10030 


3 


507 


GGSLRFSPPRVPSCSRVFCPVPPGGCGLFSPMS 

ASRPQSPTTPWCLPRRYMKHKRDDGPEKQED 

EAVDVTPVMTCVFVVMCCSMLVLLYYFYDL 

LVYWIGIFCI^SATGLYSClj\PCVRRl.PFGK 

CRIPNNSLPYFHKRPQARMLLLALFCVAVSV 

VWGVFRNEDQ 


1277 


2627 


A 


10035 


51 


869 


YSRFTVFLPATMAS SE VARHLLFQSHMATKT 

TCMSSQGSDDEQIKRENIRSLTMSGHVGFESL 

PDQLVNRSIQQGFCFNILCVGETGIGKSTLIDT 

LFNTNFEDYESSHFCPN\nKLKAQTYELQESN 

VQLKLTIVNTVGFGDQINKEERQLXfRSQSTEN 

PQKYRSEQHPVEPKKCTSFWKGALGKWAGIE 

SSGQSAQQPYLPINSPPHRLADVADVHLFSSV 

LSGAFGCYHLDVTVNEFKKQQNRDEQEGYS 

KGDQEQGSWKHGADPLRGGEM 


1278 


2628 


A 


10036 


3 


457 


RAFDVRRKKSLRPCCPRDFHAGCLTVSGPST 
VMGAVGESLSVQCRYEEKYKTFNKYWCRQP 
CLPIWHEMVETGGSEGWRSDQVIITDHPODL 
TFTVTJLENLTADDAGKYRCGIATILQEDGLSG 
FLPDPFFQ VQ VL V S S ASSTENS VKTP 


1279 


2629 


A 


10039 


214 


435 


NDSLVPMSSWRSCARAPSSESAWRRSAATRR 
SRKCLRTKRKRWSSGKGTQMQSTLSETPRRA 
QMPCMWWYPFWG 


1280 


2630 


A 


10043 


2 


344 


RATWHNAGKEREAVQLMAGAEKRVKASHS 
I^RGLFGGNTRIEEACENmUAANMFKMAK 
NWSAAGNAFCQAAKLHMQLQSKHDSATSFV 
DAGNAYKKADPQGKTARHVACYLCV 


1281 


2631 


A 


10080 


620 


818 


VIYKLDSSLFSYFIYFFIFETESHFLPLMKWTG 
PIMAHCSLKILASRNSADSAFLSAGDTSLSHST 


1282 


2632 


A 


10084 


3 


1640 


SASinRGDKRASGEVGIAPSSRHILIGEPSAKY 

NGTAUSLVRGPGELGEVTVFWRIFPPSVGEFA 

ETSGKLTMRDEQSAV1WIQALNDDIPEEKSF 

YEFQLTAVSEGGVLSESSSTANITVVASDSPY 

GRFAFSHEQLRVSEAQRVNITIIRSSGDFGHVR 

LWYKTMSGTAEAGLDFVPAAGELLFEAGEM 

RKSLHVEILDDDYPEGPEEFSLTITKVELQGR 

GYDFTIQENGLQIDQPPEIGNISIVRHIMKNDN 

AEGEEFDPKYTAFEVEEDVGLHvUPWRLHGT

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YGYVTADFISOSSSASPGGVDYILHGSTVTFO 

HGQNLSFINISliDDNESEFEEPIEILLTQATOO 

AVLGRHLVSRIIIAKSDSPFGVIRFLNQSKISIA 

NPNSTMILSLVLERTGGLLGEIQVNWErVGPN 

SQEALLPQNRDIADPVSGLFYFGEGEGGVRTII 

LTIYPHEEIEVEETFIIKLHLVKGEAKLDSRAK 

DVTLHQEFGDPNGWQFAPETLSKKTYSEPL 

AJLEGPLLITFFVRRVKGTFGE1M 


1283 


2633 


A 


10088 


316 


516 


MGSKTLPAPVPIHPSLQLTNYSFLQAVNGLPT 
VPSDHLPNL YGF S ALHAVHLHQ WTLG YP AM 
HLXRS 


1284 


2634 


A 


10091 


2 


569 


FVSPSRAMASALIYVSKFKSFVILVVTPLLLLP 

LVILMPAKFVRCAYVIILK1AIYWCTEVIPLAV 

TSLMPVLLFPLFQILDSRQVCVQYMKDTNML 

FLGGLIVAVAVERWNLHKRIALRTLLWVGA 

KP ARLMLGFMG VTALL SM WISNT ATT AMMV 

PIVEAILQQMEATSAATEAGLELVDKGKAKE 

LP 


1285 


2635 


A 


10092 


290 


728 


KQSTRPDVMTLYPLHWQEEMSGESWSSAVP 
AAATRTTSFKGTSPSSKYVKLNVGGALYYTT 
MQTLTKQDTMLKAMFSGRMEVLTDSEGWIL 
IDRCGKHFGTILNYLRDGAVPLPESRREIEELL 
AEAKYYLVQGLVEECQAALQV 


1286 


2636 


A 


10100 


1 


574 


RPRGRGAWAGPGGDYSGVRRQQRRRTRISGS 

QRGSDAAGTMGCCTGRCSLiaCALQLVSAL 

ERQIFDFLGFQWAPILGNFLHIIVVILGLFGTIQ 

YRPRYIMVYTVWTALWVTWNVFIICFYLEVG 

GLSKDTDLMTFNISVHRSWWREHGPGCVRR 

VLPPSAHGMMDDYTYVSVTGCIVDFQYLEVI 

HSA 


1287 


2637 


A 


10103 


252 


376 


RSRMGDKPIWEQIGSSFIQHYYQLFDNDRTQL 
GAIYVSFQL 


1288 


2638 


A 


10107 


1


478 


MEEEDESRGKTEESGEDRGDGPPDRDPTLSPS 

AFILRAIQQAVGSSLQGDLPNDKDGSRCHGL 

RWRRCRSPRSEPRSQESGGTDTATVLDMATD 

SFLAGLVSVLDPPDTWVPSRLDLRPGESEDM 

LELVAEVRIGDRDPIPLPVPSLLPRJLRAWRTG 

KT 


1289 


2639 


A 


10113 


237 


438 


LLSRMPSTNRAGSLKDPEIAELFFKEDPEKLFT 
DLREIGHGSFGAAYFARDVRTNEVVAIKKMS 
YSG 


1290 


2640 


A 


10114 


367 


856 


RGAKAKSAVLPPGPPCSSILHSPPAPLTPRSPG

TEATRPTAMSKSLKKKSHWTSKVHESVIGRN 

PEGQLGFELKGGAENGQFPYLGEVKPGKVAY 

ESGSKLVSEELLLEVNETPVAGLTIRDVLAVI 

KHCKDPLRLKCVKQGESSGLLSVLPGGGTAR 

GAGQ 


1291 


2641 


A 


10116 


128 


591 


RTBRETERRSALSCSVLKSEPLPGLQPQASQQR 

RRRLPGRRQVQVQEGGGSGLRAWVLAMASV 

LGSGRGSGGLSSQLKCKSKRRRRRRSKRKDK 

VSILSTFLAPFKHLSPGITNTEDDDTLSTSSAE 

VKENRNVGNLAARPPPSGDRARGGATR 


1292 


2642 


A 


10121 


1 


749 


QRRRFRAGLWGGHGLTDGLRRNGGCGCSAR 

VPRVGERLRGHRCPDPLCLLLDMLFLSFHAG 

S WESWCCCCLIPADRPWDRGQI IWQLEMADT 

RSVHETRFEAAVKVIQSLPKNGSFQPTNEMM 

LKFYSFYKQATEGPCKLSRPGFWDPIGRYKW 

DAWS SLGDMTKXEAMIA YVEEMKKIIETMP 

MTEKVEELLRVIGPFYE1VEDKKSGRSSDITSD 
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LGNVLTSTPNAKTVNGKAESSDSGAESEEEE 
AC 


1293 


2643 


A 


10124 


2 


989 


PLMSLVRWEF VAAS SAQKTPSRLENYYMVC 

KADEKFNQLVHFLRNHKQEKHLVFFRYSSGL 

CGRG1RDSARMCSTCACVEYYGKALEVLVK 

GVKIMCniGKMKYKRNKEFMEFRKLQSGlLV 

CTDVMARGIDIPEVNWVLQYDPPSNASAFVH 

RCGRTARIGHGGSALVFLLPMEESYINFLAIN 

QKCPLQEMKPQRNTADLLPKLKSMALADRA 

VFEKGMKAFVSYVQAYAKHECNLIFRLKDL 

DFASLARGFALLRMPKMPELRGKQFPDFVPV 

DVNTDTIPFKDKIREKQRQKLLEQQRREKTEN 

EGRRKFIKNKAWSKQJECAKKK 


1294 


2644 


A 


10129 


91 


1042 


VTMYKDC1ESTGDYFLLCDAEGPWGIILESLA 

ILGIVVTILLLLAFLFLMRKIQDCSQWNVLPTQ 

LLFLLSVLGLFGLAFAFDELNQQTAPVRYFLF 

GVLFALCFSCLLAHASNLVKLVRGCVSFSWT 

TILaAIGCSLLQIIIATEYVTLIMTRGMMFVN 

MTPCQLNVDFVVLLVYVLFLMALTFFVSKAT 

FCGPCENWKQHGRLIF1TVLFSUIWVVWISML 

LRGNPQFQRQPQWDDPWaALVTNAWVFL 

LLYIVPELCCLYRSCRQECPLQGNACPVTAYQ 

HSFQVENQELSRDKWKVLLNSDFLSHSGA 


1295 


2645 


A 


10133 


376 


518 


RPRWTHNSQWCFLPQDHPGWLPG QSGAPG 
GRGAPRQEGPGSSWRQV 


1296 


2646 


A 


10135 


3 


551 


EWSLDPFMGIMS GQVGDLSPSQEKSLAQFRE 

NIQDVLSALPNPDDYFLLRWLQARSFDLQKS 

EDMHIKHMEFRKQQDLANILAWQPPEVVRL 

YNANGICGHD GEGSPVWYMVGSQDPKGLLL 

SASKQELLRDSFRSCELLLRECELQSQKLGKR 

VEKHAIFGLEGLGLRDLWKPGIELLQE 


1297 


2647 


A 


10138 


48 


407 


MVSSCCGSVCSDQGCGQDLCQETCCRPSCCE 
TTCCRTTCCRPSCCVSSCCRPQCCQSVCCQPT 
CSRPSCCQTTCCRTTCYRPSCCVSSCCRPQCC 
QPVCCQPTCCRPSCCETTCCHPXCC 


1298 


2648 


A 


10156 


94 


453 


GGNRKSAEMFSQVPRTPASGCYYLNSMTPEG 
QEMYLRFDQTTRRSPYRMSRILARHQLVTKI 
QQEIEAKEACDWLRAAGFPQYAQLYEDSQFP 
INIVAVKNDHDFLEKDLGEPLCRRLNT 


1299 


2649 


A 


10161 


1 


393 


PRFSELVDGRGRVSARFGGSPSKAATVRSQPT 
ASAQLENMEEAPKRVSLALQLPEHGSKD1GN 
VPGNGSENPCQNGGTCVPGADAHSCDCGPGF 
KGRRCELACIKVSRPCTRLFSETKAFPVWEGG 
VGHHV 


1300 


2650 


A 


10162 


98 


391 


AKIASLERIMPANYTCTRPDGDNTDFRYFIYA 
VTYTGELGPGLIGNILALWVFYGYMKETKRA 
VIFMINLAIADLLQVLSLPLRIFYYLKHDWPF 
VPV 


1301 


2651 


A 


10165 


1 


7545 


PGIRVGITSQTGLSSNLQENCSKLAFISSHGTE 

KQLQCMPMEGRGRASSSISDLQGKGFEKGTG 

EKHVPGV GSARHSPQ ASAGGSP WQRGKAQT 

RWLGKPDPGRKRRRGSPQEEGGLRVSAAAR 

LLCSGANRCKVLVRQNSTPNTQQPAVHPSTP 

PSRPLPQAGRCLVAPLRPHPDWVAAKTLAKA 

LRAPGKPWRLAAPSPLGDLGAPGLPGPSTAP 

RTLSVEEPGVECNQLCLYADVTDPVLCLGQK 

DPGVEGKHCEKEKISSSKELKHVHAKSEPSKP 

ARRLSESLHVVDENKNESKIEREHKRRTSTPV 

IMEGVQEETDTRDVKRQVERSEICTEEPQKQ 
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KSTLKNEKHIJKKDDSE'IPHLKSLLKKEVKSS 

KEKPEREKTPSEDKLSVKHKYKODCMHKTO 

DETELHSSEKOLKVEENIQKQSQQTKLSSDDK 

TERKSKHRNERKLSVLGKDGKPVSEYIIKTDE 

NVRJCENNKJCERIU,SAEKTKAEHKSRRSSDSK 

IQKDSLGSKQHGITLQRRSESYSEDKCDMDST 

NMDSNLKPEEVVHKEKMITKSLLEEKLVLKS 

KSKTQGKQVKWETELQEGATKQATTPKPD 

KEKNTEENDSEKQRKSKVEDKPFEETGYEPV 

LETASSSAHSTQKDSSHRAKLPLAKEKYKSD 

KDSTSTRLERKLSDGHKSRSLKHSSKDIKKKD 

ENKSDDKDGKEVDSSHEKARGNSSLMEKKL 

SRRLCENRRGSLSQEMAKGEEKLAANTLSTP 

SGSSLQRPKKSGDMTLDPEQEPMEIDSEPGYE 

NVFEVSKTQDNRNNNSHQDIDSENMKQKTS 

ATVQKDELRTCTADSKATAPAYKPGRGTGV 

NSNSEKHADHRSTLTKXMHIQSAVSKMNPGE 

KEPIHRGTTEVNIDSETVHRMLLSAPSENDRV 

QKNLKNT.^AEEHVAQGD ATLEHSTNLDS SPS 

LSSVTVVPLRESYDPDVIPLFDKRTVLEGSTA 

STSPADHSALPNQSLTVRESEVLKTSDSKEGG 

EGFTVDTPAKASITSKRHIPEAHQATLLDGKQ 

GKVIMPLGSKLTGVIVENENITKEGGLVDMA 

KKENDLNAEPNLKQTKATVENGKKDGIAVD 

HWGLNTEKYAETVKLKHKRSPGKVKDISID 

VERRNENSEVDTSAGSGSAPSVLHQRNGQTE 

DVATGPRRAEKTSVATSTEGKDKDVTLSPVK 

AGPATTTSSETRQSEVALPCTSIEADEGLIIGT 

HSRNNPLHVGAEASECTVFAAAEEGGAVVTE 

GFAESETFLTSTKEGESGECAVAESEDRAADL 

LAVHAVK1EANVNSVV1EEKDDAVTSAGSEE 

KCDGSLSRDSEIVEGTITFISEVESDGAVTSAG 

TEIRAGSIS SEEVDGSQGNMMRMGPKKETEG 

TVTCTGAEGRSDNFVICSVTGAGPREERMVT 

GAGWLGDNDAPPGTSASQEGDGSVNDGTE 

GESAVTSTGITEDGEGPASCTGSEDSSEGFAIS 

SESEENGESAMDSTVAKEGTNVPLVAAGPCD 

DEG1VTSTGAKEEDEEGEDWTSTGRGNEIGH 

ASTCTGLOEESEGVLICESAEGDSQIGTWEH 

VEAE AGAAIMNANENNYD SMSGTEK.GSKDT 

DICSSAKGM&SVTSAVSGKDEVTPVPGGCE 

GPMTSAASDQSDSQLEKVEDTTTSTGLVGGS 

YDVLVSGEVPECEVAHTSPSEKEDEDIITSVE 

NEECDGLMATTASGDITNQNSLAGGKNQGK 

VLIISTSTTNDYTPQVSAITDVEGGLSDALRTE 

ENMEGTRVTTEEFEAPMPSAVSGDDSQLTAS 

RSEEKDECAMISTSIGEEFELPISSATTIKCAES 

LQPVAAAVEERATGPVLISTADFEGPMPSAPP 

EAESPLASTSKEEKDECALISTSIAEECEASVS 

uVV VhbbNliRAG I VMlxJsJUvjbOJJbl oo YEIX, 

EGPVSSAVPQEEGDPSVTPAEEMGDTAMISTS 

TSEGCEAVMIGAVLQDEDRLTITRVEDLSDA 

A1ISTSTAECMPISASIDRHEENQLTADNPEGN 

GDLSATEVSKHKVPMPSLIAENNCRCPGPVR 

GGKEPGPVLAVSTEEGHNGPSVHKPSAGQGH 

PSAVCAEKEEKHGKECPE1GPFAGRGQKESTL 

HLINAEEKNVLLNSLQKEDKSPETGTAGGSST 

ASYSAGRGLEGNANSPAHLRGPEQTSGQTAK 

DSSVSSERYLAAVNTGAIKADDMPPVQGTVA 

EHSFLPAEQQGSEDNLKTSTTKCITGQESKIAP 
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SHTMIPPATYSVALLAPKCEQDLTONDYSGK 
WTDQASAEKTGDDNSTRKSFPEEODIMVTVS 
SEENVCDIGNEE SPLNVLGGLKLKANLKMEA 
YVPSEEEKNGEILAPPESLCGGKPSGIAELQRE 
PLLVNESLNVENSGFRTNEEIHSE$YNKGEISS 
GRKDNAEAISGHSVEADPKEVEEEERHMPKR 
KRKQHYLSSEDEPDDNPDVLDSRIETAQRQC 
PETEPHATKEENSRDLEELPKTSSETNSTTSRV 
MEEKDEYSSSETTGEKPEQNDDDT1KSQE 


1302 


2652 


A 


10167 


321 


842 


EPSLFPFLRPSPARPPPRPPAPFPSPELAGPEPH 

FVFYFFLSYVHPPKELAKYEYMEEQVILTEKG 

NSTVAGRGTSVRCLSPSPRPLPPLLPLLADLLE 

DGFGEHPFYHCL VAE VPKEHWTPEGNPSPFP 

EARETKCYVRSSVGCVEPLTTQAEVTENLDR 

KNSQQVFKLLKKK 


1303 


2653 


A 


10171 


206 


429 


NMILLKKRRLLINSLGEGTINGLLDELLETNV 
LSQEDTEIVKCENVTVIDKARDLLDSV1RKGA 
RACEICITYI 


1304 


2654 


A 


10184 


970 


1524 


LCTLSPGISGTAGSCLTTEPGTELGTSFAQNGF 
YHEAVVLFTQALKLNPQDHRLFGNRSFCHER 
LGQPAWALADAQVALTLRPGWPRGLFRLGK 
ALMGLQRFREAAAVFQETLRGGSQPDAAREL 
RSCLLHLTLQGQRGGICAPPLSPGALQPLPHA 
ELAPSGLPSLRCPRSTALRSPGLSPLLH 


1305 


2655 


A 


10194 


2 


394 


TDLLGRRFRVDGAAMAACEGRRSGALGSSQ 

SDFLTPPVGGAPWAVATTWMYPPPPPPPHR 

DnSVTLSFGESYDNSKSWRRRSCWRKWKQL 

SRLQRNMILFLLAFLLFCGLLFYTNLADHWKG 

IRNTCT 


1306 


2656 


A 


10195


1 


410 


IPGSTISLEGPLSKWTNVMKGWQYRWFVLDY 

NAGLLSYYTSKDKMMRGSRRGCVRLRGAV1 

GIDDEDDSTFTJTVDQKTFHFQARDADEREK 

WIHALEETILRHTLQLQVRVFTWFPDSSLVGA 

FFFWLVSGFFFK 


1307 


2657 


A 


10205 


85 


308 


QGLFSTMVKLGCSFSGKPGKDPGDQDGAAM 
DSVPLISPLDISQLQPPLPDQWIKTQTEYQLS 
SPDQQNYTKSR 


1308 


2658 


A 


10214 


2 


453 


ECGGIRQPGPGPPPALASAPAATMNRVGGSPS 
AAANYLLCTNCRKVLRKDKRIRVSQPLTRGP 
SAFIPEKEVVQANTVDERTNFLVEEYSTSGRL 
DN1TQVMSLHTQYLESFLRSQFYMLRMDGPL 
PLPYRHYIAIMAAARHQCSYLINM 


1309 


2659 


A 


10233 


45 


421 


RGWPEQQSTGRPRDVARQPRCQKEEGRRLRP 
RALESRITQGSERSRWGPPLESTKENVQCGH 
RPAFPNS S WLPFHERLQVQNGECPWQVSIQM 
SRKHLCGGSILHWWWVLTAAHCFRRTLLDM 
AV 


1310 


2660 


A 


10241 


243 


442 


AFQLFNAKCESAFLSKRNPLQRNWTVLYRRK 
HKKGQSAErQKKRTRRAFKFQRATTGASLADI 
MAK 


1311 


2661 


A 


10261 


751 


116 


LPGADYGGGHLSLRLFHLLLTSAAWVPDESQ 

VTLNSAICVLSTVLIMEFPDLGKHCSEKTCKQ 

LDFLPVKCDACKQDFCKDHFPYAAHKCPFAF 

QKDVHVPVCPLOOTIPVKKGQIPDVVVGDHI 

DRDCDSHPGKKKEKJFTTOCSKEGCKKKEML 

QMVCAQCHGNFC1QHRHPLDHSCRHGSRPTI 

KAG 


1312 


2662 


A 


10270 


3 


669 


STSSDEGSPSA STPMINKTGFKFSAEKP VIEVP 
SMTILDKKD GEQAKALFEK VRKFRAHVEDSD
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LIYKLYWOTVIKTAKFIFILCYTANFVNA1SF 
EHVClO^KV^GYEVFECmhMAYMLKKL 
LISYISIICYYGFICLYTLFWLFRIPLKEYSFEKV 
REES SF SDIPD VKNDFAFLLHMVDQ YDQL Y S 
KRFGVFLSEVSENKLREISLNHEWTFEKL 


1313 


2663 


A 


10287 


1221 


266 


gahrvlspaqgaqprlrsaasvevsmvgqr 

vlllvafllsgvllseaakilhstlggshyll 

ldrvsqclqehghnvtmlhqsgkflipdikee 

eksyqvirwfspedhqkrikkhfdsyietald 

grkesealvklme3fgtqcsyllsrkdimdsl 

knenydlvfveafdfcsfliaeklvkpfvail 

pttfgsldfglpsplsyvpvfpslltdhmdfw 

grvknflmffsfsrsqwdmqstfdntikehf 

pegsrpvlshlllkaelwfvnsdcafdfarpl 

lpntvyigglmekpikpvpqvsepsafslgft 


1314 


2664 


A 


10288 


536 


1890 


NVQLAKFS STL VFFFSCD ADPS AL AKY VL AL 

VKKDKSEKELKALCIDQLDVFLQKETQIFVEK 

LFDAVNTKSYLPPPEQPS SGSLKVEFFPPQEK. 

DIKKEEITKEEEREKKFSRRLNHSPPQSS SRYR 

ENRSRDERKKDDRSRfCRDYDRNPPRRDSYRD 

RYNRRRGRSRSYSRSRSRSWSKERLRERDRD 

RSRTRSRSRTRSRERDLVKPKYDLDRTDPLEN 

NYTPVSSVPSISSGHYPVPTLSST1TVIAPTHHG 

NNTTESWSEFHEDQVDHNSYVRPPMPKKRC 

RDYDEKGFCMRGDMCPFDHGSDPWVEDVN 

LPGMQPFPAQPPWEGPPPPGLPPPPPILTTPPV 

NLRPPVPPPGPLPPSLPPVTGPPPPLPPLQPSG 

MDAPPNSATSSVPTVVTTGIHHQPPPAPPSLFT 

ADTYDTD G YNPE AP S ITNTSRPMYRHRVHPR 

AKLG 


1315 


2665 


A 


10293 


447 


1331 


SHPLLSCPEKVSAKLRAAAEAAAEERRTRGA 

GSRGICAGLRSVAPGPEPLKQEEGRREWCjSSI 

GTPSPCGSAQAAAAAAAEEATEKIPALRPALL 

WALLALWLCCATPAHALQCRDGYEPCVNEG 

MCVTYHNGTGYCKCPEGFLGEYCQHRDPCE 

KNRCQNGGTCVAQAMLGKATCRCASGFTGE 

DCQYSTSHPCFVSRPCLNGGTCHMLSRDTYE 

CTCQVGFTGKNPKCPG GNLNYQFNGIIWYS 

GGSVPPSGTKTSKPAEHNAMGTGSKNFASGT 

LWVMVSGATSTSTSTL 


1316 


2666 


A 


10294 


118 


572 


SLSMESNHKSGDGLSGTQKEAALRALVQRTG 

YSLVQENGQRKYGGPPPGWDAAPPERGCEIFI 

GKLPRDLFEDELPLCEKIGKTYEMRMMMDF 

NGNNRGYAFVTFSNKVEAKNAIKQLNNYE1R 

NGRLLGVCASVDNCRLFVGGIPKTKK 


1317 


2667 


A 


10301 


158 


1956 


LLKSCGmSGVCIPCEGKGPTVLVIQTAVPQ 

DRPTKSSMRSAAKPWNPAIRAGGHGPDRVRP 

LPAASSGMKSSKSSTSLAFESRLSRLKRASSE 

DTLNKPGSTAASGVVRJLKKTATAGAISELTES 

RLRSGTGAFTTTKRTGIPAPREFSVTVSRERSV 

PRGPSNPWCSVSSPTSSNTPTPTKHLRTPSTKP 

KQENEGGEKAALESQVRELLAEAKAKDSEIN 

RLRSELKKYKEKRTLNAEGTDALGPNVDGTS 

VSPGDTEPMIRALEEKNKNFQKELSDLEEENR 

VLKEKLlYLEHSPNSKGAASHTGDSSCPTSri'Q 

ESSFGSPTGNQLSSD1DEYKKNIHGNALRTSG 

SSSSDVTKASLSPDASDFEHITAETPSRPLSSTS 

NPFKS SKCSTAGSSPNS VSELSLASLTEKIQKM 

EENHHSTAEELQATLQELSDQQQMVQELTAJE

NEKLVDEKT1LETSFHQHRERAEQLSQENEKL 
MNLLQERVKNEEPTTQEGKIIELEQKCTGILE 
QGRFEREKLLNIQQQLTC SLRKVEEENQG AL 
EMIKRLKEENEKLNEFLELERHNNNMNlAKrL 
EECRVTLBGLKMENGSLKSHLQG 


1318 


2668 


A 


10303 


333 


879 


GECFIMAAWQQNDLVFEFASNVMEDERQL 

GDPAIFPAVIVEHVPGADILNSYAGLACVEEP 

NDMITESSLDVAEEEIIDDDDDDITLTVEASCH 

IXjDETTETIEAAEALLNMDSPGPMLDEKRINN 

NIFSSPEDDMVVAPVTHVSVTLDGIPEVMETQ 

QVQEKYADSPGASSPEQPKRKKK 


1319 


2669 


A 


10322 


169 


654 


KffiVRMSGSVAVTRAIAVPGLLLLLIIATAI.SL 

LIGAKSLPASWLEAFSGTCQSADCTIVLDAR 

LPRTLAGLLAGGALGLAGALMQTLTRNPLAD 

PGLLGVNAGASFAIVLGAALFGYSSAQEQLA 

MAFAGALVASLIVAFTGSQGGGQLSPVRLTL 

AGVXL 


1320 


2670 


A 


10323 


441 


2 


KMNQVAWIGGGQTLGAFLCHGLAAEGYRV 

AWDIQSDKAANVAQEINAEYGESMAYGFG 

ADATSEQSVLALSRGVDEIFGRVDLLVYSAG1 

AKAAFISDFQLGDFDRSLQVNLVGYFLCARE 

FSRLMERDGIQGRIIQINSKSDE 


1321 


2671 


A 


10332 


1 


453 


RHRTAGPGSTIS SRTDSAS APAARAMPCE YT Y 

AKLTSDCSRPSLQWYTRAQSKMRRPRLLLKD 

ILKCTLLVFGVRILYILKLNYTTEECDMKNMH 

YVDPDHVKRAQKYAQQVLQKESPPKFAKTS 

MALLFEHRYSVDLLPFVQKAPTDSEA 


1322 


2672 


A 


10333 


25 


423 


EPSNGPWYSALGNEDDEILLLGKDIIGTFAAS 

ERKMRAHQVLTFLLLFVITSGASENASTSRGC 

GLDLLPQNVYLCDLDAIWGIWEAVAGAGA 

LITLLLMLILLGRLPFIKEKEKKSPAVLHFLFL 

LGTLG 


1323 


2673 


A 


10334 


52 


426 


SSLGNEDDEILSLAKDITGMFVASHRKMRAH 
QVLTFLLLFVrrSVASENASTSRGCGLDLLPQ 
YVSLCDLDAIWGIWEAAAGAGALITLLLMLI 
LLVRLPFFKEKEKKSPVGLHFLFLLGTLGP 


1324 


2674 


A


10336 


1 


932 


ERLCFPCMQSKIYSYMSPNKCSGMRFPLQEE 

NSVTHHEVKCQGKPLAGIYRKREEKRNAGN 

AVRSAMKSEEQKIKDARKGPLVPFPNQKSEA 

AEPPKTPPSSCDSTNAAIAKQALKKPIKGKQA 

PRKKAQGKTQ QNRKLTDFYP VRRS SRKSKAE 

LQSEERKRIDELIESGKEEGMKIDLIDGKGRG 

VXATKQFSRGDFVVEYHGDLEEITDAKKREAL 

YAQDPSTGCYMYYFQYLSKTYCVDATRETN 

RLGRLINHSKCGNCQTKLHDIDGVPHLILIAS 

RDIAAGEELLYDYGDRSKASIEAHPWLKH 


1325 


2675 


A 


10338 


3 


870 


PGSTISCSELKGTQCRATAGSRGRRPPMTCWL 

RGVTATFGRPAEWPGYLSHLCGRSAAMDLG 

PMRKSYRGDREAFEETHLTSLDPVKQFAAWF 

EEAVQCPDIGEANAMCLATCTRDGKPSARML 

LLKGFGKDGFRFFTNFESRKGKELDSNPFASL 

VFYWEPLNRQVRVEGPVKKLPEEEAECYFHS 

RPKSSQIGAWSHQSSVIPDREYLRKKNEELE 

QLYQDQEVPKPKSWGGYVLYPQVMEFWQG 

QTNRLHDRIVFRRGLPTGDSPLGPMTHRGEE 

DWLYERLAP 


1326 


2676 


A 


10344 


2 


984 


ARAAAHCGICRLVRWWRKRRSVMGIQTSPV 
LLASLGVGLVTLLGLAVGSYLVRRSRRPQVT 
LLDPNEKYLLRLLDKTT V SHNTKRFRF ALPTA

HHTLGLPVGKH1YLSTRIDGSLVTRPYTPVTSD 

EDQGYVDLVIKVYLKGVHPKFPEGGKMSQY 

LDSLKVGDWEFRGPSGLLTYTGKGHFNIQP 

NKKSPPEPRVAKKLGMIAOGTGrrPMLQLIRA 

ILKVPEDPTXJCFLLFANQTEKDULREDLEELQ 

ARYPNRFKLWFTLDHPPKDWAYSKGFVTAD 

MIREHLPAPGDDVLVLLCGPPPMVQLACHPN 

LDKLGYSQKMRFTY 


1327 


2677 


A 


10345 


1 


968 


LQSAGEOVTHVLILLESPARPVAAVTQVQRR 

RYHRLSDMSMLAERRRKQKWAVDPQNTAW 

SNDDSKFGQRMLEKMGWSKGKGLGAQEQG 

ATDHIKVQVKNNHLGLGATINNEDNWIAHQ 

DDFNQLLAELNTCHGQETTDSSDKKEKKSFS 

I^EKSKISKNRVIiYNQCFTKGBCDLSSRSKTDL 

DCIFGKRQSKKTPEGDASPSTPEENEnTTSAF 

TIQEYFAKRMAALKNKPQVPVPGSDISETQVE 

RKRGKKRNKEATGKDVESYLQPKAKRHTEG 

KPERAEAQERVAKKKSAPAEEQLRGPCWDQ 

SSKASAQDAGDHVQPA 


1328 


2678 


A 


10346 


173 


439 


GSAAMKVKIKCWNGVATWLWVANDENCGI 
CRMAFNGCCPDCKVPGDDCPLVWGQCSHCF 
HMHCILKWLHAQQVQQHCPMCRQEWKFKE 


1329 


2679 


A 


10351 


3 


964 


QMEPGNDTQISEFLLLGFS QEPGLQPFLFGLFL 

SMYLVTVLGNLLULATISDSHLHTPMYFFLSN 

LSFADICVTSTTIPKMLMNIQTQNKVITYIACL 

MQMYFFILFAGFENFLLSVMAYDRFVAICHP 

LHYMVIMNPHLCGLLVLASWTMSALYSLLQI 

LMVVRLSFCTALEIPHFFCELNQVIQLACSDSF 

LNHMV1YFTVALLGGGPLTGILYSYSKIISSIH 

AISSAQGKYKAFSTCASHLSWSLFYGAELGV 

YLSSAATRNSHSSATASVMYTWTPMLNPF1 

YSLRNKDKRALGrHLLWGTMKGQFFKKCP 


1330 


2680 


A 


10352 


34 


2573 


1PFLKSCCCCCXFDFPPPPLDQVQEEECEVERV 

TEHGTPKPFRKFDSVAFGESQSEDEQFENDLE 

TDPPNWQQLVSREVLLGLKPCEIKRQEVINEL 

FYTERAHVRTLKVLDQVFYQRVSREG1LSPSE 

LRKIFSNLEDILQLHIGLNEQMKAVRKRNETS 

VIDQIGEDLLTWFSGPGEEKLKHAAATFCSNQ 

PFALEMIKSRQKKDSRFQTFVQDAESNPLCRR 

LQLKDI1PTQMQRLTKYPLLLDNIATYTEWPT 

EREKVKKAADHCRQILNYVNQAVKEAENKQ 

RLEDYQRRLDTSSLKLSEYPNVEELRNLDLTK 

RKMIHEGPLVWKVNRDKTIDLYTLLLEDILV 

LLQKQDDRLVLRCHSKELASTADSKHTFSPVI 

Kl^TVLVRQVATDNKALFVISMSDNGAQIYE 

LVAQTVSEKTVWQDUCRMAASVKEQSTKPI 

PLPQSTPGEGDNDEEDPSKLKEEQHGISYTGL 

QSPDRDLGLESTLISSKPQSHSLSTSGKSEVRD 

LFVAERQrAKEQrniXjTLKEVGEDYQIAff 

HLPVSEERWALDALRNLGLLKQLLVQQLGLT 

EKSVQEDWQHFPRYRTASQGPQTDSVIQNSE 

NDCAYHSGEGHMPFRTGTGDIATCYSPRTSTE 

SFAPRDSVGLAPQDSQASNILVMDHMIMTPE 

MPTMEPEGGLDDSGEHFFDAREAHSDENPSE 

GDGAVNKEEKDVNLRISGNYLILDGYDPVQE 

SSTDEEVASSLTLQPMTGIPAVBSTHQQQHSP 

QNTHSDGAISPFTPEFLVQQRWGAMEYSCFEI 

QSPSSCADSQSQIMEY1HKIEADLEHLKKVEE 

SYTLLCQRLAGSALTDKHSDKS


1331 


2681 


A 


10353 


1 


2100 


AVEFAEGALTMAPWPELGDAQPNPDKYLEG 

AAOOQPTAPDKSKETNKTDNTEAPVTKIBLLP 

SYSTATLIDEPTEVDDPWNLPTLQDSGIKWSE 

RDTKGKDLCFFQGIGRLILLLGFLYFFVCSLDIL 

SSAFQLVGGKMAGQFFSNSSIMSNPLLGLVIG 

VLVTVLVQSSSTSTSIVVSMVSSSLLTVRAAIP 

IIMGAN1GTSITNTIVALMQVGDRSEFRRAFA 

QATVHDFFNWLSVLVLLPVEVATHYLEIITQL 

IVESFHFKNGEDAPDLLKVITKPFTKUVQLDK 

KVISQIAMNDEKAKNKSLVKIWCKTFTNKTQ 

INVTVPSTANCI^PSLCWTIXjIQNWTMKNVT 

YKENIAKCOHIFVNFHI PDLAVGTILT IT <?T T V 

LCGCLIMI VKELG S VLKGQ VATVIKKTINTDFP 

FPFAWLTGYLAILVGAGMTFIVQSSSVFTSAL 

TPLIGIGVITffiRAYPLTLGSNIGTTTtAILAAL 

ASPGNALRSSLQIALCHFFFNISGtLLWYPIPFT 

RJLPIRMAKGLGNISAKYRWFAVFYLIIFFFLIP 

LTVFGLSLAGWRVLVGVGVPVVFniLVLCLR 

LLQSRCPRVLPKKLQN^WLPLWMRSLKPW 

DAWSKPTGCFQMRCCC(XRVCCRACCLLC 

GCPKCCRCSKCCEDJLEEAQEGQDVPVKAPET 

FDNIT1SREAQGEVPASDSKTECTAL 


1332 


2682 


A 


10354 


30 


1377 


SQOGSQPHRQGPPSLLTAPHSLDLPALPPGPR 

GSQGKLRRVLVPMSVKPSWGPGPSEGVTAVP 

TSDLGEIHNWTELLDLFNHTLSECHVELSQST 

KR WLFAL YL AMFW GL VENLL VIC VN WRG 

SGRAGL^4NLYILNMAIADLGIVLSLPVWMLE 

VTLDYTWLWGSFSCRFTHYFYFVNMYSSIFF 

LVCLSVDRYVTLTSASPSWQRYQHRVRRAM 

CAGIWVLSAIIPLPEWfflQLVEGPEPMeLFM 

APFETYSTWALAVALSTTILGI^LPFPLITVFN 

VLTACRLRQPGQPKSRRHCLLLCAYVAVFV 

MCWLPYHVTLLLLTUIGTHISLHCHLVHLLY 

FFYDVirXFSMLHCVINPILYNFLSPHFRGRLL 

NAVVHYLPKDQTKAGTCASSSSCSTQHSIIIT 

KGDSQPAAAAPHPEPSLSFQAHHLLPNTSPISP 

TQPLTPS 


1333 


2683 


A 


10358 


2 


884 


AAGAGADGREPASERASRAEPPAVAMGQND 

LMGTAEDFADQFLRVTKQYLPHVARLCLIST 

FLEDGIRMWFQWSEQRDYIDTTWNCGYLLA 

SSFVFLNLLGQLTGCVLVLSRNFVQYACFGLF 

GI1ALQTIAYSILWDLKFLMRNLALGGGLLLL 

LAESRSEGKSMFAGVPTMRESSPKQYMQLGG 

RVLLVLMFMTLLHFDASPFSIVQNT/GTALMI 

LVAIGFKTKLAALTLVVWLFAINVYFNAFWT 

1PVYKPMHDFLKYDFFQTMSVIGGLXLWAL 

GPGGVSMDEKKKEW 


1334 


2684 


A 


10367 


59 


1562 


QAWSLQVALSPFFFPASPSNSFAAAVPQLLFP 

ELPLPHVPGQESAKRRSARRFLIMSELTKELM 

EL VWGTKSSPGLSDTDFCRWTQGF VFSESEG S 

ALEQFEGGPCAVIAPVQAFLLKKLLFSSEK.SS 

WRDCSQEEQKELLCHTLCDILESACCDHSGS 

YCLVSWLRGKTTEETASISGSPAESSCQVEHS 

SALAVEELGFERFHALIQKRSFRSLPELKDAV 

LDQYSMWGNKFGVLLFLYSVLLTKGIEMKN 

EIEDASEPLIDPVYGHGSQSLINLLLTGHAVSN 

VWDGDRECSGMKLLGIHEQAAVGFLTLMEA 

LRYCKVGSYLKISKJPYLDCLASETHLTVFFA 

KDMALVAPEAPSEQARRVFQTYDPEDNGFIP 

DS1XEDVMKA1J)LVSDPEYIN1MKNKLDPEG

LGIILLGPFLQEFFPDQGSSGPESFTVYHYNGL 
DTPDCRCLQTKWPYIH.LW7TDRSPSLN 


1335 


2685 


A 


10375 


82 


2929 


TRTKRRLGREKAMASPPRGWGCGELLLPFML 

LGTLCEPGSGQIRYSMPEEU)KGSFVGNIAKD 

LGLEPQELAERGVR1VSRGRTQLFALNPRSGS 

LVTAGRIDREELCAQSPLCVVNFNILVENKM 

KIYGVEVEHDINDNFPRFRJDEELKVKVNENA 

AAGTRLVLPFARDADVGVNSLRSYQLSSNLH 

FSLDVVSGTDGQKYPELVLEQPLDREKETVH 

DLLLTALDGGDPVLSGTTHIRVTVLDANDNA 

PLFTPSEYSVSVPENIPVGTRLLMLTATDPDE 

G1NGKLT Y SFRMEEEKISE'l'FQLDSNLGEISTL 

QSLDYEESRFYLMEWAQDGGALVASAKW 

VTVQDVNDNAPEVILTSLTSSISEDCLPGTVIA 

LFSVHDGDSGENGEIACS1PRNLPFKLEKSVD 

NYYHLLTTRDLDREETSDYNITLTVMDHGTP 

PI..STESHIPLKVADVNDNPPNFPQASYSTSVT 

ENNPRGVSIFSVTAHDPDSGDNARVTYSLAE 

DTFQGAPLSSYVSINSDTGVLYAtRSFDYEQL 

RDLQLWVTASDSGNPPLSSNVSLSLFVLDQN 

DNTPEILYPALPTDGSTGVELAPRSAEPGYLV 

TKVYAVDKDSGQNAWLSYRLLKASEPGLFA 

VGLHTGEVRTARALLDRDALKQSLVVAVED 

HGQPPLSATFTVTVAVADRIPDILADLGSIKTP 

IDPEDLDLTLYLVVAVAAVSCVFLAFVIVLLV 

LRLRRWHKSRLLQAEGSRLAGVPASHFVGV 

DGVRAFLQTYSHEVSLTADSRKSHLIFPQPNY 

ADTLLSEESCEKSEPLLMSDKVDANKEERRV 

QQAPPNTDWRFSQAQRPGTSGSQNGDDTGT 

WPNNQFDTEMLQAMILASASEAADGSSTLGG 

GAGTMGLSARYGPQFTLQHVLQGELGSDYR 

QNVYIPGSNATLTNAAGKRDGKAPAGGNGN 

KKKSGKKEKK 


1336 


2686 


A 


10379 


1 


557 


RPRRRQPSFSCRVLVLEDPPCFRFTNSMNQEK 
LAKLQAQVRIGGKGTARRKKKVVHRTATAD 
DKKLQSSLKKLAVNNIAGIEEVNMIKDDGTVI 
HF>M»KVQASLSANTFAITGHAEAKPITEMLP 
GILSQLGADSLTSLRKLAEQFPRQVLDSKAPK 
PEDIDEEDDDVPDLVENFDEASKNEAN 


1337 


2687 


A 


10380 


1 


1263 


IPGSTISWSPAAARGLSVCRCCRLHPASAMDL 
FGDLPEPERSPRPAAGKEAQKGPLLFDDLPPA 
SSTDSGSGGPLLFDDLPPASSGDSGSLATSISQ 
M VK I JiuKuAKKKl ScEEKNGbEELVEKKVC 
KASSVIFGLKGYVAERKGEREEMQDAHVILN 
D1TEECRPPSSLITRVSYFAVFDGHGGIRASKF 
MQNLHQNLIRKFPKODVISVEKTVKRCIXD 
TFKHTDEEFLKQASSQKPAWKDGSTATCVLA 
VDNILYIANLGDSRAILCRYNEESQKHAALSL 
SKEHNPTOYEERMRIOKAGGNVRDGRVT GV 
LEVSRSIGDGQYKRCGVTSVPDIRRCQLTPND 
RFILLACDGLFKVFTPEEAVNFILSCLEDEK1Q 
TREGKSAADARYEAACNRLANKAVQRGSAD 
NVTVMWRIGH 


1338


2688 


A 


10385 


3 


589 


GPSQSMAAGELEGGKPLSGLLNALAQDTFHG 
YPGITEELLRSQLYPEVPPEEFRPFLAKMRGIL 
KS1ASADMDFNQLEAFLTAQTKKQGGITSDQ 
AAV1SKFWKSHKTKIRESLMNQSRWNSGLRG 
LSWRVDGKSQSRHSAQIHTPVAIIELELGKYG 
QESEFLCLEFDEVKVNQILXTLSEVEESISTLIS
QPN 


1339 


2689 


A 


10386 


50 


390 


LGAMAkHHPDLtfCRKQA^VAlGRLCEKCDG 
KCVICDSYVRPCTLVRICDECNYGSYQGRCVI 
CGGPGVSDAYYCKECI1QEKDRDGCPKIVNL 
GSSKTDLFYERKKYGFKKR 


1340 


2690 


A 


10388 


113 


3472 


SQLRKGASATHS SPSRTDCIAQMMDI YVCLK 

RPSWMVDNKRMRTASNFQWLLSTFILLYLM 

NQVNSQKKGAPHDLKCVTNNLQVWNCSWK 

APSGTGRGTDYEVCIENRSRSCY QLEKTSIKIP 

ALSHGDYEITINSLHDFGSSTSKFTLNEQNVSL 

IPDTPEILNLSADFSTSTLYLKWNDRGSVKPHR 

SNVIWEIKVLRKESMELVXLVTHNTTLNGKD 

TLHHWSWASDMPLECAIHFVEIRCYIDNLHFS 

GLEEWSDWSPVKNISWIPDSQTKVFPQDKVIL 

VGSDITFCCVSQEKVLSALIGHTOCPLIHLDGE 

NVAIKIRNISVSASSGTNYVFTTEDNIFXjTVIF 

AGYPPDTPQQLNCETHDLKEnCSWNPGRVTA 

L VGPRATS YTLVESFSGK Y VRLKRAEAFTNES 

YQLLFQMLPNQEIYNFTLNAHNPLGRSQSTIL 

VN1TEKVYPHTPTSFKVKDINSTAVKLSWHLP 

GNFAKINFLCEIEIKKSNS VQEQRNVTIKG VE 

NSSYLVALDKLNPYTLYTFRIRCSTETFWKW 

SKWSNKKQHLTTEASPSKGPDTWREWSSDG 

KNLHYWKPLPINEANGKILSYNVSCSSDEErQ 

SLSEIPDFQHKAEIRLDKNDYUSVVAKNSVGS 

SPPSKIASMEDPNDDLKIEQVVGMGKGILLTW 

HYDPNMTCDYVIKWCNSSRSEPCLMDWRKV 

PSNSTETVIESDEFRPGIRYNFFLYGCRNQGY 

QLLRSMIGY1EELAPIVAPNFTVEDTSADSILV 

KWEDIPVEELRGFLRGYLFYFGBCGERDTSKM 

RVLESGRSD1KVKNTTDISQKTLRIADLQGKTS 

YHLVUIAYTDGGVGPEKSMYVVTKENSVGL 

1IAILIPVAVAVIVGVVTS1LCYRKREWIKETFY 

PDIPNPENCKALQFQKSVCEGSSALKTLEMNP 

CTPNNVEVLETRSAFPKIEDTEIVSPVAERPEN 

RSDAKPENHWESYCPPIIEEEIPNPAADETGG 

TAQVrYIDVQSMYQPQAKPEEEQENDPVGGA 

GYKPQMHLPrNSTVEDIAAEEDLDKTAGYRP 

QANVNTWNLVSPDSrRSIDSNSEIVSFGSPCSI 

NSRQFL1PPKDEDSPKSNGGGWSFTNFFQNKP 

ND 


1341 


2691 


A 


10392


1 


5057 


MLPPKHLSATKPKKSWAFNLYELDSDLTKEP 

DVnGEGPTDSEFFHQRFRNLTYVEFVGPRKTL 

KLR^CLDWLQPETOTKEEIIELLVLEQYLTII 

PEKLBCPWVRAKKPENCEKLVTLLENYKEMY 

QPEGESLHGVLWSAGLRCPLGLSASTLLTW 

SGLDNSLSWAAVGMSCVLWDIELHHDFLGV 

ATKSVSTHAQGDAAQGLGG1WRMWARDSN 

LATGVLLDDNNSDVTSDDDMTRNRRESSPPH 

SVHSFSGDRDWDRRGRSRDTEPRDRWSIITR 

NPRSRMPPRDLSLP WAKTSFEMDREDDRD S 

RAYESRSQDAESYQNWDLAEDRKPHNTIQD 

NMENYRKLLSLGVQLAEDDGHSHMTQGHSS 

RSKRSAYPSTSRGLKTMPEAKKSTHRRGICED 

ESSHGVIMEKFIKDVSRSSKSGRARESSDRSQ 

RFPRMSDDNWKDISLNKRESVIQQRVYEGNA 

FRGGFRFNSTLVSRKRVLERKJRRYHFDTDGK 

GS1HDQKGCPRKKPFECGSEMRKAMSVSSLS 

SLSSPSFTESQPIDFGAMPYVCDECGRSFSVIS 

EFVEHQIMHTRENLYEYGESFIHSVAVSEVQK

SEQ ID NO: of nucleotide sequence

SEQ ID NO: of peptide sequence

Method

SEQ ID NO: in USSN 09/496,914

Predicted beginning nucleotide location corresponding to first amino acid residue of peptide sequence

Predicted end nucleotide location corresponding to last amino acid residue of peptide sequence

Amino acid sequence (A=Alanine, C=Cysteine, D=Aspartic Acid, E=Glutamic Acid, F=Phenylalanine, G=Glycine, H=Histidine, I=Isoleucine, K=Lysine, L=Leucine, M=Methionine, N=Asparagine, P=Proline, Q=Glutamine, R=Arginine, S=Serine, T=Threonine, V=Valine, W=Tryptophan, Y=Tyrosine, X=Unknown, *=Stop codon, /=possible nucleotide deletion, \=possible nucleotide insertion)
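The one-letter legend in the column heading above can be written down as a lookup table. The following is a minimal Python sketch, not part of the application: the names `AMINO_ACIDS`, `MARKERS` and `classify_symbols` are hypothetical, and the idea of counting symbols outside the legend is only an assumption about how one might flag characters damaged in reproduction.

```python
# One-letter residue codes as listed in the column heading.
AMINO_ACIDS = {
    "A": "Alanine", "C": "Cysteine", "D": "Aspartic Acid", "E": "Glutamic Acid",
    "F": "Phenylalanine", "G": "Glycine", "H": "Histidine", "I": "Isoleucine",
    "K": "Lysine", "L": "Leucine", "M": "Methionine", "N": "Asparagine",
    "P": "Proline", "Q": "Glutamine", "R": "Arginine", "S": "Serine",
    "T": "Threonine", "V": "Valine", "W": "Tryptophan", "Y": "Tyrosine",
}

# Non-residue symbols from the same heading.
MARKERS = {
    "X": "Unknown",
    "*": "Stop codon",
    "/": "possible nucleotide deletion",
    "\\": "possible nucleotide insertion",
}


def classify_symbols(sequence: str) -> dict:
    """Count residues, legend markers, and symbols that are not in the legend."""
    counts = {"residues": 0, "markers": 0, "unrecognized": 0}
    for symbol in sequence:
        if symbol.isspace():
            continue  # line breaks and spacing carry no sequence information
        if symbol in AMINO_ACIDS:
            counts["residues"] += 1
        elif symbol in MARKERS:
            counts["markers"] += 1
        else:
            counts["unrecognized"] += 1  # e.g. digits or lowercase letters
    return counts
```

A non-zero `unrecognized` count for a table entry would suggest that the printed sequence contains characters that are not valid legend symbols.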














SQVGGKIOTCKI)CGin*FNKSAALAEHRKIHA 

RGVLVEC^QECEEAI^ 

DKFYECRVCKETFLHSSALIEHQKIHFGDDKD 

NEREHERERERERGETFRPSPALNEFQKMYG 

KEKMYECKVCGETTLHSSSLKEHQKIHTRGN 

PFENKGKVCEETFIPGQSLKRRQKTyNKEKLC 

DFTIX3RDAFMQSSELSEHQKIHSRKNLFEGR 

GYEKSVIHSGPFTESQKSHT1TRPLESDEDEKA 

FTTSSNPYENQKIPTKENVYEAKSYERSVIHSL 

ASVEAQKSHSVAGPSKPKVMAESTIQSFDAIN 

HQRVRAGGNTSEGREYSRSVIHSjLVASKPPRS 

HNGNELVESNEKGESSIYISDLNDKRQKIPAR 

ENPCEGGSKNRNYEDSVIQSWRAKPQKSVP 

GEGSGEFKKDGEFSVPSSNVREYQKARAKKK 

YIEHRSNETSV1HSLPFGEQTFRPRGMLYECQ 

ECGECFAHSSDL-reHQKIHDREKPSGSRNYE 

WSVIRSLAPTDPQTSYAQEQYAKEQARNKCK 

DFRQFFATSEDLNTNQKIYDQEKSHGEESQGE 

NTDGEETHSEETHGQETIEDPVIQGSDMEDPQ 

KDDPDDKIYECEDCGLGFVDLTDLTDHQKVH 

SRKCLVDSREYTHSVIHTHSISEYQRDYTGEQ 

LYECPKCGESFIHSSFLFEHQRJHEQDQLYSM 

KGCDDGFIALLPMKPRRKRAAERNPALAGSA 

IRCLLCGQGFIHSSALNEHMRLHREDDLUEQS 

QMAEEAI1PGLALTEFQRSQTEERLFECAVCG 

ESFVNPAELADHVTVHKNEPYEYGSSYTHTS 

FLTEPLKGAIPFYECKDCGKSFIHSTVLTKHKE 

LHLEEEEEDEAAAAAAAAAQEVEANVHVPQ 

WLRIQGLNVEAAEPEVEAAEPEVEAAEPEV 

EAAEPNGEAEGPDGEAAEPIGEAGQPNGEAE 

QPNGDADEPDGAGIEDPEERAEEPEGKAEEPE 

GDADEPDGVGIEDPEEGEDQEIQVEEPYYDC 

HECTETFT S STAFSEHLKTHASMIIFEPANAFG 

ECSGYIERASTSTGGANQADEKYFKCDVCGQ 

LFNDHL SL ARHQNTHTG 


1342 


2692 


A 


10393 


2 


1350 


GRPRSSSDNRNFLRERAGLSSAAVQTRIGNSA 

ASRRSPAARPPVPAPPALPRGRPGTEGSTSLS 

APAVLWAVAWVVWSAVAWAMANY1HV 

PPGSPEVPKLNVTVQDQEEHRCREGALSLLQ 

HIJIPHWDPQEVTLQLFTO 

tmedwlvriygnktellvdrdeevksfrvl 

qaiigcapqlyci™nglcyefiqgealdpkh 

vcnpaifrliarqlakjhaihahngwipksnl 

wlkmgkyfsliptgfadedinkrflsdipssqi 

lqeemtwmkeilsnlgspvvlchndllcknii 

ynekqgdvqfidyeysgynylaydignhfne 

fagvsdvdyslypdrelqsqwlrayleayk 

firKUrUlxjV llilSJlVilJJLrlvYXNyrAljAorlrr YV 

GLWALIQAKYSTIEFDFLGYAIVRFNQYFKM 
KPEVTALKVPE 


1343 


2693 


A 


10394 


102 


839 


PEAQTSAVLAREKGHLPTMRHEAPMQMASA 

QDARYGQKDSSDQNFDYMFKLLUGNSSVGK 

TSFLFRY.ADDSFTSAFVSTVGIDFKVKTVFKN 

EKRIKLQIWDTAGQERYRHTTAYYRGAMGFI 

LMYDITNEESFNAVQDWSTQIKTYSWDNAQ 

VILVGNKCDMEDERVISTERGQHLGEQLGFE 

FFETSAKDNINVKQTFERLVDnCDKMSESLET 

DPAITAAKQNTRLKETPPPPQPNCAC 


1344 


2694 


A 


10395 


2 


4136 


DRPPWNSRVDDFVTNLIHLSSKGHISPAKDTS 
LQQRTPAEMSPVLHFYVRPSGHEGAASGHTR














MCLQGKLPELQGVETELCYNVNWTAEALPSA 

EETKKLMWLFGCPLLLDD VARES WLLPOSN 

DLLLEVGPRLNFSTPTSTNIVSVCRATOLGPV 

DRVETTRRYRLSFAHPPSAEVEAIALATLHDR 

MTEQHFPHPIQSFSPESMPEPLNGP1MLGEGR 

LALEKANQELGLALDSWDLDFYTKRFQELQR 

NPSTVEAFDLAQSNSEHSRHWFFKGQLHVDG 

QKLVIISLFESIMSTQESSNPNNVLKFCDNSSA 

IQGKEVRFLRPEDPTRPSRFQQQQGLRHWFT 

AETHNFPTGVCPFSGATTGTGGRIRDVQCTG 

RGAHWAGTAGVCFGNLHIPGYNLPWEDLSF 

QYPGNFARPLEVAIEASNGASDYGNKFGEPV 

LAGFARSLGLQLPDGQRREWIKPIMFSGGIGS 

MEADHISKEAPEPGMEWKVGGPVYRIGVGG 

GAASSVQVQGDNTSDLDFGAVQRGDPEMEQ 

KMNRVIRACVEAPKGNPICSLHDQGAGGNG 

NVLKELSDPAGAIIYTSRFQLGDPTLNALEIW 

GAEYQESNALLLRSPNRDFLTHVSARERCPA 

CFVGTTTGDRRIVLVDDRECPVRRNGQGDAP 

PTPPPTPVDLELEWVLGKMPRKEFFLQRKPP 

MLQPLALPPGLSVHQALERVLRLPAVASKRY 

LTNKVDRSVGGLVAQQQCVGPLQTPLADVA 

WALSHEEL1GAATALGEQPVKSLLDPKVAA 

RJLAVAEALTNLVFALVTDLRDVKCSGNWM 

WAAKLPGEGAALADACEAMVAVMAALGVA 

VDGGKDSLSMAARVGTETVRAPGSLVISAYA 

VCPDITATVTPDLKHPEGRGHLLYVALSPGQ 

HRLGGTALAQCFSQLGEHPPDLDLPENLVRA 

FSrTQGLLKDRLLCSGHDVSDGGLVTCLLEM 

AFAGNCGLQVDVPVPRVDVLSVLFAEEPGLV 

LEVQEPDLAQVLKRYRDAGLHCLELGHTGE 

AGPHAMVRVSVNGAWLEEPVGELRALWEE 

TSFQLDRLQAEPRCVAEEERGLRERMGPSYC 

LPPTFPKASVPREPGGPSPRVAILREEGSNGDR 

EMADAFHLAGFEVWDVTMQDLCSGAIGLDT 

FRGVAFVGGFSYADVLGSAKGWAAAVTFHP 

RAGAELRRFRKRPDTFSLGVCNGCQLLALLG 

WVGGDPNEDAAEMGPDSQPARPGLLLRHNL 

SGRYESRWASVRVGPGPALMLRGMEOAVLP 

VWSAHGEGYVAFSSPELQAQIEARGLAPLHW 

ADDDGNPTEQYPLNPNGSPGGVAGICSCDGR 

HLAVMPHPERAVRPWQWAWRPPPFDTLTTS 

PWLQLFINARNWTLEGSC 


1345 


2695 


A 


10396 


65 


642 


GVRGFWAGTMASRAGPRAAGTDGSDFQHRE 
RVAMHY QMS VTLKYEIKKUYVHLVIWLLL V 
AKMSVGHLRLLSHDQVAMPYQWEYPYLLSI 
LPSLLGLLSFPRNNISYLVLSMISMGLFSIAPLI 
YGSMEMFPAAQQLYRHGKAYRFLFGFSAVSI 
MYLVLVLAVQVHAWQLYYSKKLLDSWFTST 
QEKKHK 


1346


2696






A 


10398 


1 


718 


DDFVRCGPQSAAMGASARLLRAVIMGAPGS 

GKGTVSSRITTHFELKHLSSGDLLRDNMLRGT 

EIGVLAKAFIDQGKLIPDDVMTRLALHELKNL 

TQYSWLLDGFPRTLPQAEALDRAYQIDTVINL 

NVPFEVIKQRLTARWIHPASGRVYNIEFNPPK 

TVGIDDLTGEPLIQREDDKPETVIKRLKAYED 

QTKPVLEYYQKKGVLETFSGTETNKIWPYVY 

AFLQTKVPQRSQKASVTP 


1347 


2697 


A 


10402 


153 


1969 


KHRQENNALDMAPEIHMTGPMCJLIENTNGEL 
VANPEALIGLSAITQPWWAIVGLYRTGKSY 






WO 01/57188 PCT/US01/0380G

















LMNKLAGKNKGFSLGSTXTCSHTKGIWMWCV 

PHPKKPEHTLVLLDTEGLGDVKKGDNQNDS 

WIFTUVaSSTLVYNSMGTINQQAMDQLYY 

VTELTHRIRSKSSPDENENEDSADFVSFFPDFV 

WTLRDFSLDLEADGQPLTPDEYLEYSUCLTQ 

GTSQKDKNFNLPRLCIRKFFPKKKCFVFDLPI 

HRRKLAQLEKLQDEELDPEFVQQVADFCSYI 

FSNSKTKTLSGGIKVNGPRLESLVLTYINAISR 

GDLPCMENAVLALAQIENSAAVQKAIAHYD 

QQMGQKVQLPAETLQELLDLHRVSEREATEV 

YMKNSFKDVDHLFQKiaAAQLDKKRDDFCK 

QNQEASSDRCSALLQV1FSPLEEEVKAGIYSK 

PGGYCLFIQKLQDLEKKYYEBPRKGIQAEEIL 

QTYLKSKESVTDAILQTDQILTEKEKEIEVEC 

VKAESAQASAKMVEEMQIKYQQMMEEKEKS 

YQEIIVKQLTEKMERERAQLLEEQEKTLTSKL 

QEQARVLKERCQGESTQLQNEIQKLQKTLKK 

KTKRYMSHKLKI 


1348 


2698 


A 


10404 


5 


892 


TQLPAPLSGVLSRLQLGSGAPLLTWVQETAG 

VAGGAPRRRTPVTMWRLLARASAPLLRVPLS 

DSWALLPASAGVKTLLPVPSFEDVSIPEKPKL 

RFIERAPLVPKVRREPKNLSDIRGPSTEATEFT 

EGNFACLALGGGYLHWGHFEMMRLTINRSM 

DPKNMFAIWRVPAPFKP1TRKSVGHRMGGGK 

GAIDHYVTPVKAGRLVVEMGGRCEFEEVQG 

FLDQVAHKLPFAAKAVSRGTLEKMRKDQEE 

RERNNQNPWTFERIATANMLG1RKYLSPYDL 

THKGKYWGKFYMPKRV 


1349 


2699 


A 


10409 


59 


1184 


LRRNCSALGGLFQT1ISDMKGSYPVWEDFLNK 

AGKLQSQLRTTVVAAAAFLDAFQKVADMAT 

NTRGGTREIGSALTRMCMRHRSIEAKLRQFSS 

ALIDCLIKPLQEQMEEWKKVANQLDKDHAK 

EYKKARQEIKKKSSDTLKLQKKAKKGRGDIQ 

FQLDSALQDVNDKYLLLEETEKQAVRKALIE 

ERGRFCTFISMLRPVIEEEISMLGEITHLQTISE 

DLKSLTMDPHKLPSSSEQVILDLKGSDYSWS 

YQTPPSSPSTTMSRKSSVCSSLNSVNSSDSRSS 

GSHSHSPSSHYRYRSSNLAQQAPVRLSSVSSH 

DSGFISQDAFQSKSPSPMPPEAPNQRRKEKRE 

PDPNGGGPTTASGPPAAAEEAQRPRSM 


1350 


2700 


A 


10410 


511 


958 


AGRGGPGKPVSWSSGPGSPGQTQRRSWVKST 

RGHSSLLPPSQDFVAGLSVILRGTVDDRLNW 

AFNLYDLNKDGC1TKEEMLDIMKSIYDMMG 

KYTYPALREEAPREHVESFFQKMDRNKDGV 

VTIEEFIESCQKDENIMRSMQLFDNVI


WHAT IS CLAIMED IS:

1. An isolated polynucleotide comprising a nucleotide sequence selected from the group
consisting of SEQ ID NO: 1-1350, a mature protein coding portion of SEQ ID NO: 1-1350, an
active domain of SEQ ID NO: 1-1350, and complementary sequences thereof.

2. An isolated polynucleotide encoding a polypeptide with biological activity, wherein said 
polynucleotide hybridizes to the polynucleotide of claim 1 under stringent hybridization 
conditions. 

3. An isolated polynucleotide encoding a polypeptide with biological activity, wherein said
polynucleotide has greater than about 90% sequence identity with the polynucleotide of claim 1.
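As an illustration of the "sequence identity" figure recited in claim 3, the sketch below computes percent identity for two sequences that have already been aligned to equal length. This is a simplified assumption: the claim does not name an alignment algorithm, the function name `percent_identity` is hypothetical, and the convention of dividing by the full alignment length (gap columns included) is only one of several in use.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over two pre-aligned, equal-length sequences.

    Columns in which either sequence has a gap ("-") count as non-matching.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)
```

Under these assumptions, a candidate would be compared against the claimed threshold with a check such as `percent_identity(a, b) > 90.0`.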

4. The polynucleotide of claim 1 wherein said polynucleotide is DNA.

5. An isolated polynucleotide of claim 1 wherein said polynucleotide comprises the 
complementary sequences. 
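The "complementary sequences" of claims 1 and 5 can be illustrated for DNA with standard Watson-Crick base pairing. The following is a minimal sketch; the function name `reverse_complement` is hypothetical, and handling only the four unambiguous bases is a simplifying assumption.

```python
# Watson-Crick pairing for the four unambiguous DNA bases (case preserved).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return dna.translate(_COMPLEMENT)[::-1]
```

Applying the function twice returns the original strand, which is a quick consistency check on any sequence it is used with.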

6. A vector comprising the polynucleotide of claim 1.

7. An expression vector comprising the polynucleotide of claim 1.

8. A host cell genetically engineered to comprise the polynucleotide of claim 1.

9. A host cell genetically engineered to comprise the polynucleotide of claim 1 operatively 
associated with a regulatory sequence that modulates expression of the polynucleotide in the host 
cell. 

10. An isolated polypeptide, wherein the polypeptide is selected from the group consisting of: 

(a) a polypeptide encoded by any one of the polynucleotides of claim 1; and

(b) a polypeptide encoded by a polynucleotide hybridizing under stringent conditions
with any one of SEQ ID NO: 1-1350.

11. A composition comprising the polypeptide of claim 10 and a carrier.

12. An antibody directed against the polypeptide of claim 10.

13. A method for detecting the polynucleotide of claim 1 in a sample, comprising:

a) contacting the sample with a compound that binds to and forms a complex 
with the polynucleotide of claim 1 for a period sufficient to form the complex; and 

b) detecting the complex, so that if a complex is detected, the polynucleotide 
of claim 1 is detected. 

14. A method for detecting the polynucleotide of claim 1 in a sample, comprising: 

a) contacting the sample under stringent hybridization conditions with 
nucleic acid primers that anneal to the polynucleotide of claim 1 under such conditions; 

b) amplifying a product comprising at least a portion of the polynucleotide of claim 1; and

c) detecting said product and thereby the polynucleotide of claim 1 in the sample.

15. The method of claim 14, wherein the polynucleotide is an RNA molecule and the method
further comprises reverse transcribing an annealed RNA molecule into a cDNA polynucleotide. 

16. A method for detecting the polypeptide of claim 10 in a sample, comprising: 

a) contacting the sample with a compound that binds to and forms a complex 
with the polypeptide under conditions and for a period sufficient to form the complex; and 

b) detecting formation of the complex, so that if a complex formation is 
detected, the polypeptide of claim 10 is detected.



17. A method for identifying a compound that binds to the polypeptide of claim 10,
comprising: 

a) contacting the compound with the polypeptide of claim 10 under 
conditions sufficient to form a polypeptide/compound complex; and 

b) detecting the complex, so that if the polypeptide/compound complex is 
detected, a compound that binds to the polypeptide of claim 10 is identified. 

18. A method for identifying a compound that binds to the polypeptide of claim 10,
comprising:

a) contacting the compound with the polypeptide of claim 10, in a cell, under
conditions sufficient to form a polypeptide/compound complex, wherein the complex drives 
expression of a reporter gene sequence in the cell; and 

b) detecting the complex by detecting reporter gene sequence expression, so 
that if the polypeptide/compound complex is detected, a compound that binds to the polypeptide 
of claim 10 is identified.

19. A method of producing the polypeptide of claim 10, comprising:

a) culturing a host cell comprising a polynucleotide sequence selected from
the group consisting of a polynucleotide sequence of SEQ ID NO: 1-1350, a mature protein
coding portion of SEQ ID NO: 1-1350, an active domain of SEQ ID NO: 1-1350,
complementary sequences thereof and a polynucleotide sequence hybridizing under stringent 
conditions to SEQ ID NO: 1-1350, under conditions sufficient to express the polypeptide in said 
cell; and 

b) isolating the polypeptide from the cell culture or cells of step (a). 

20. An isolated polypeptide comprising an amino acid sequence selected from the group 
consisting of SEQ ID NO: 1351-2700, the mature protein portion thereof, or the active domain 
thereof. 
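The table rows in this section pair each nucleotide SEQ ID NO with a peptide SEQ ID NO that is 1350 higher (for example 1339 with 2689, and 1350 with 2700), which is consistent with the ranges recited in claims 1 and 20. The sketch below encodes that observed pairing; the offset is inferred from the rows shown here rather than stated in the claims, and the names `PEPTIDE_OFFSET` and `peptide_seq_id` are hypothetical.

```python
NUCLEOTIDE_IDS = range(1, 1351)  # SEQ ID NO: 1-1350, as recited in claim 1
PEPTIDE_OFFSET = 1350            # inferred from the table rows, not from the claims


def peptide_seq_id(nucleotide_seq_id: int) -> int:
    """Return the peptide SEQ ID NO paired with a nucleotide SEQ ID NO."""
    if nucleotide_seq_id not in NUCLEOTIDE_IDS:
        raise ValueError("nucleotide SEQ ID NO must be between 1 and 1350")
    return nucleotide_seq_id + PEPTIDE_OFFSET
```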

21. The polypeptide of claim 20 wherein the polypeptide is provided on a polypeptide array.

22. A collection of polynucleotides, wherein the collection comprises the sequence 
information of at least one of SEQ ID NO: 1-1350. 

23. The collection of claim 22, wherein the collection is provided on a nucleic acid array.

24. The collection of claim 23, wherein the array detects full-matches to any one of the 
polynucleotides in the collection. 

25. The collection of claim 23, wherein the array detects mismatches to any one of the
polynucleotides in the collection. 
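Claims 24 and 25 distinguish full matches from mismatches. A physical array makes that distinction by hybridization; the sketch below is only an in-silico analogy that compares a probe and a target of equal length position by position. The function names `mismatch_positions` and `is_full_match` are hypothetical.

```python
from typing import List


def mismatch_positions(probe: str, target: str) -> List[int]:
    """Return the 0-based positions at which probe and target differ."""
    if len(probe) != len(target):
        raise ValueError("probe and target must have the same length")
    return [
        i
        for i, (p, t) in enumerate(zip(probe.upper(), target.upper()))
        if p != t
    ]


def is_full_match(probe: str, target: str) -> bool:
    """True when the probe matches the target at every position."""
    return not mismatch_positions(probe, target)
```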

26. The collection of claim 22, wherein the collection is provided in a computer-readable
format.
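Claim 26 does not name a particular computer-readable format. Purely as an illustration, the sketch below writes a collection of sequences as FASTA-style text. The choice of FASTA, the header pattern `SEQ_ID_NO_<n>`, and the function name `to_fasta` are all assumptions made for this example.

```python
def to_fasta(records: dict, width: int = 60) -> str:
    """Serialize {SEQ ID NO: sequence} pairs as FASTA-style text."""
    lines = []
    for seq_id, sequence in records.items():
        lines.append(f">SEQ_ID_NO_{seq_id}")
        # Wrap the sequence into fixed-width lines.
        lines.extend(
            sequence[i:i + width] for i in range(0, len(sequence), width)
        )
    return "\n".join(lines) + "\n"
```

A call such as `to_fasta({1: "ACGTAC"}, width=4)` produces one header line followed by the sequence wrapped at four characters per line.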
Pages 340 to 1963 of this application contain amino acid sequence listings.
They can be obtained at the address given below. 




World Intellectual Property Organization 
34, chemin des Colombettes 
CH-1211 Genève 20

27. A method of treatment comprising administering to a mammalian subject in need thereof
a therapeutic amount of a composition comprising a polypeptide of claim 10 or 20 and a
pharmaceutically acceptable carrier.

28. A method of treatment comprising administering to a mammalian subject in need thereof 
a therapeutic amount of a composition comprising an antibody that specifically binds to a 
polypeptide of claim 10 or 20 and a pharmaceutically acceptable carrier.